  1. JDK
  2. JDK-8207389

DataInputStream.read() and DataInputStream.read(byte[]) behave differently

    • x86_64
    • windows_10

      ADDITIONAL SYSTEM INFORMATION :
      OS: Windows 10 64 bit
      I first found the error while using an earlier release of Java 1.8, but have since updated my JDK and JRE to release 1.8.0_172, and the bug persists.
      I'm using Eclipse (Oxygen version 4.7.2)

      A DESCRIPTION OF THE PROBLEM :
      This code illustrates how DataInputStream.read(byte[]) fails to read a zipped entry completely, while DataInputStream.read() (called in a loop) succeeds. A link to the file that the code operates on is provided.

      ftp://ftp2.census.gov/geo/tiger/TIGER2017/TABBLOCK/tl_2017_09_tabblock10.zip
      entry: tl_2017_09_tabblock10.dbf
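
For context, the documented contract of InputStream.read(byte[]) is to read "some number of bytes" up to the array length and to return the count actually read, so a single call is not required to fill the array. The sketch below is not part of the original report. It shows that behavior in isolation, which may be what the zipped case is running into. ChunkedInputStream and singleBulkRead are hypothetical names, and the 300-byte cap is an arbitrary assumption standing in for whatever chunking a decompressing stream might apply.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PartialReadDemo {

    /** Hypothetical stream that hands back at most maxChunk bytes per bulk read. */
    static final class ChunkedInputStream extends FilterInputStream {
        private final int maxChunk;

        ChunkedInputStream(InputStream in, int maxChunk) {
            super(in);
            this.maxChunk = maxChunk;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Legal under the InputStream contract: return fewer bytes than requested.
            return super.read(b, off, Math.min(len, maxChunk));
        }
    }

    /** Returns the count reported by a single DataInputStream.read(byte[]) call. */
    public static int singleBulkRead(byte[] source, int maxChunk) throws IOException {
        try (DataInputStream dis = new DataInputStream(
                new ChunkedInputStream(new ByteArrayInputStream(source), maxChunk))) {
            return dis.read(new byte[source.length]);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[1000];
        System.out.println("requested: " + source.length
                + ", returned by one read(byte[]): " + singleBulkRead(source, 300));
    }
}
```

With the assumed cap of 300, one call returns 300 even though 1000 bytes are available. The rest of the destination array keeps its default zeros, similar to the 0 seen at the first disagreeing index in the report.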

      I pasted in source code below that should reproduce the bug. Note that the path names will need to be altered to suit the local environment. Also note that the variable "fileSize" is hard-coded, so it would need to be altered as well if another file is used.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      1. Download this file from Census
      ftp://ftp2.census.gov/geo/tiger/TIGER2017/TABBLOCK/tl_2017_09_tabblock10.zip

      2. Unpack the zipped archive.

      3. In the code (provided below), alter the following line to reflect the location of the compressed and uncompressed files you just downloaded.

      String zipFilePath ="\\\\mfs2\\GuseJ-Research\\ESRI Block Data\\tl_2017_09_tabblock10.zip";
      String uncompressedFilePath = "\\\\mfs2\\GuseJ-Research\\ESRI Block Data\\tl_2017_09_tabblock10\\tl_2017_09_tabblock10.dbf";

      4. Compile and run (I did this from within Eclipse).

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      Comparing dis.read(byte[]) with dis.read() with Zippedfile
      num read: 7096203
      total agreement: true

      Comparing dis.read(byte[]) with dis.read() with Uncompressedfile
      num read: 7096203
      total agreement: true
      ACTUAL -
      Comparing dis.read(byte[]) with dis.read() with Zippedfile
      num read: 30199
      Disagreement at : 30199
      allBytesAtOnce[30199]: 0
      allBytesOneAtATime[30199]: 51
      total agreement: false

      Comparing dis.read(byte[]) with dis.read() with Uncompressedfile
      num read: 7096203
      total agreement: true

      ---------- BEGIN SOURCE ----------
      package esri;

      import java.io.DataInputStream;
      import java.io.FileInputStream;
      import java.io.IOException;
      import java.io.InputStream;
      import java.util.zip.ZipFile;

      public class DISBugReport {


      /**
      * This method is written to illustrate a bug
      * in which DataInputStream.read(byte[]) fails while DataInputStream.read()
      * succeeds in reading the same file.
      *
      * The failure only seems to happen when the DataInputStream is derived
      * from a zipped archive entry. To illustrate this,
      * the argument to the method determines whether a zipped entry generates
      * the input stream or a regular file (FileInputStream) does.
      *
      * @param fromZipped if set to true the DataInputStream will sit on top of a zipped entry
      * @throws IOException
      */
      public static void checkBytes(boolean fromZipped) throws IOException{

      System.out.println("\nComparing dis.read(byte[]) with dis.read() with "+ (fromZipped?"Zipped":"Uncompressed") +"file");

      // Path and file size information for a file that reveals the bug.
      // To test on another file, replace the paths and entry name
      // and make sure fileSize is at most the size (measured in bytes)
      // of the file to be tested.

      // publicly available file from the Census here:
      // ftp://ftp2.census.gov/geo/tiger/TIGER2017/TABBLOCK/tl_2017_09_tabblock10.zip
      String entryName = "tl_2017_09_tabblock10.dbf"; // publicly available file from the Census
      String zipFilePath ="\\\\mfs2\\GuseJ-Research\\ESRI Block Data\\tl_2017_09_tabblock10.zip";
      int fileSize = 7096203;
      String uncompressedFilePath = "\\\\mfs2\\GuseJ-Research\\ESRI Block Data\\tl_2017_09_tabblock10\\tl_2017_09_tabblock10.dbf";

      // get data input stream either from zipped or uncompressed version
      // of the same file depending on "fromZipped" flag.
      // Note bug is ONLY revealed on data stream
      // derived from a zipped entry.
      DataInputStream dis;
      if(fromZipped) dis = getDataStreamFromZippedEntry(zipFilePath, entryName);
      else dis = new DataInputStream(new FileInputStream(uncompressedFilePath));

      // we will attempt to populate two byte[] arrays
      // with the entire contents of the file
      // the first ("allBytesAtOnce") will be populated using DataInputStream.read(byte[])
      // the second ("allBytesOneAtATime") will be populated using DataInputStream.read()

      // USING DataInputStream.read(byte[])
      // download the whole file into a byte array
      // and close streams
      byte[] allBytesAtOnce = new byte[fileSize];
      int numRead = dis.read(allBytesAtOnce);
      System.out.println("num read: " + numRead);
      dis.close();

      // USING DataInputStream.read()
      // do it again, reading one byte at a time
      if(fromZipped) dis = getDataStreamFromZippedEntry(zipFilePath, entryName);
      else dis = new DataInputStream(new FileInputStream(uncompressedFilePath));
      byte[] allBytesOneAtATime = new byte[fileSize];
      for(int i = 0; i < fileSize; i++) {
      allBytesOneAtATime[i] = (byte)dis.read();
      }
      dis.close();

      // Compare the two byte arrays:
      // allBytesAtOnce was populated with a call to DataInputStream.read(byte[])
      // allBytesOneAtATime was populated with successive calls to DataInputStream.read()
      boolean totalAgreement = true;
      for(int i = 0; i < fileSize; i++) {
      if(allBytesAtOnce[i] != allBytesOneAtATime[i]) {
      System.out.println("Disagreement at : " + i);
      System.out.println("allBytesAtOnce["+i+"]: "+ allBytesAtOnce[i]);
      System.out.println("allBytesOneAtATime["+i+"]: "+ allBytesOneAtATime[i]);
      totalAgreement = false;
      break;
      }
      }
      System.out.println("total agreement: " + totalAgreement);
      }

      public static DataInputStream getDataStreamFromZippedEntry(String zipFilePath, String entryName) throws IOException {
      // get data input stream from zip file
      ZipFile zf = new ZipFile(zipFilePath);
      InputStream is = zf.getInputStream(zf.getEntry(entryName));
      return new DataInputStream(is);
      }

      public static void main(String[] args) throws IOException {

      // in this test the DataInputStream will derive from a zipped file entry
      checkBytes(true);

      // in this test the DataInputStream will derive from an uncompressed file
      checkBytes(false);
      }
      }
      ---------- END SOURCE ----------
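
As a side note on the reproduction above, the hard-coded fileSize and the never-closed ZipFile could both be avoided. The following is a minimal sketch under stated assumptions, not the reporter's code: ZipEntryReader and readEntry are hypothetical names, the entry is assumed to fit in a single byte array, and the zip file is assumed to record the entry's uncompressed size (ZipEntry.getSize() returns -1 when it is unknown).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ZipEntryReader {

    /** Reads one zip entry completely, sizing the buffer from the entry metadata. */
    public static byte[] readEntry(String zipFilePath, String entryName) throws IOException {
        // try-with-resources closes the ZipFile (and its entry streams) when done.
        try (ZipFile zf = new ZipFile(zipFilePath)) {
            ZipEntry entry = zf.getEntry(entryName);
            if (entry == null) {
                throw new IOException("no such entry: " + entryName);
            }
            long size = entry.getSize();
            if (size < 0 || size > Integer.MAX_VALUE) {
                throw new IOException("unsupported entry size: " + size);
            }
            byte[] contents = new byte[(int) size];
            try (DataInputStream dis = new DataInputStream(zf.getInputStream(entry))) {
                // readFully keeps reading until the array is full or throws EOFException.
                dis.readFully(contents);
            }
            return contents;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: ZipEntryReader <zipFilePath> <entryName>");
            return;
        }
        System.out.println("num read: " + readEntry(args[0], args[1]).length);
    }
}
```

It could be invoked with the zip path and entry name from the report, for example tl_2017_09_tabblock10.zip and tl_2017_09_tabblock10.dbf.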

      CUSTOMER SUBMITTED WORKAROUND :
      Use DataInputStream.read() in a loop instead of DataInputStream.read(byte[]).
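
Two other options that avoid byte-at-a-time reads are sketched below, offered as an assumption about what would suit this use case rather than as an official resolution. DataInputStream.readFully(byte[]) blocks until the array is full or throws EOFException, and the same effect can be had by looping on read(byte[], int, int) and accumulating the returned counts. ReadAllBytesSketch, readExactly and readExactlyWithLoop are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class ReadAllBytesSketch {

    /** Fills a buffer of exactly size bytes using readFully. */
    public static byte[] readExactly(InputStream in, int size) throws IOException {
        byte[] buffer = new byte[size];
        // Throws EOFException if the stream ends before the buffer is full.
        new DataInputStream(in).readFully(buffer);
        return buffer;
    }

    /** Same result with an explicit loop over bulk reads. */
    public static byte[] readExactlyWithLoop(InputStream in, int size) throws IOException {
        byte[] buffer = new byte[size];
        int total = 0;
        while (total < size) {
            // Each call may return fewer bytes than requested, so keep going.
            int n = in.read(buffer, total, size - total);
            if (n < 0) {
                throw new EOFException("stream ended after " + total + " bytes");
            }
            total += n;
        }
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {1, 2, 3, 4, 5};
        byte[] a = readExactly(new ByteArrayInputStream(source), source.length);
        byte[] b = readExactlyWithLoop(new ByteArrayInputStream(source), source.length);
        System.out.println("total agreement: " + Arrays.equals(a, b));
    }
}
```

In the reproduction, this would amount to replacing dis.read(allBytesAtOnce) with dis.readFully(allBytesAtOnce). readFully returns void, so the "num read" line would need to print the array length instead.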

      FREQUENCY : often


            Assignee: Unassigned
            Reporter: Webbug Group (webbuggrp)