JDK-4121972

java.io.DataInputStream is very slow: Add array read methods for primitive types

    • Type: Enhancement
    • Resolution: Duplicate
    • Priority: P4
    • None
    • 1.1.5, 1.1.6
    • core-libs
    • generic, sparc
    • generic, solaris_2.5.1

      Date: Fri, 27 Feb 1998 15:09:41 -0800
      From: bill massena <###@###.###>

      Our geometry application reads sizable files (multi-Mb) of binary geometry data
      consisting of integers and floats or doubles. I am finding that the Streams I
      am using are much, much slower than C or Fortran doing the same job.

      I have a test file of 500K floats. When I read it using a C program on my SGI
      R10000 machine, it takes .03 to .04 seconds. Using the code below

      FileInputStream fin;
      BufferedInputStream bin;
      FastInputStream din = null;
      fin = new FileInputStream("fil.bin");
      bin = new BufferedInputStream(fin);
      din = new FastInputStream(bin);
      for (int i=0; i<500000; i++) {
          data[i] = din.readFloat();
      }

      takes from 5 to 6 seconds. So the speed ratio is about 150-to-1!!!

      Looking at the DataInputStream source code, I find that in.read() is being used
      to get each byte. So instead I read the data into a byte array in a single call,
      and used the conversion logic from the DataInputStream source to get:

      byte b[] = new byte[500000*4];
      din.readFully(b);
      int ib = 0;
      for (int i=0; i<NSIZ; i++) {
          data[i] = Float.intBitsToFloat(
              ((b[ib]&0xff) << 24) + ((b[ib+1]&0xff) << 16) +
              ((b[ib+2]&0xff) << 8) + (b[ib+3]&0xff));
          ib += 4;
      }

      This process takes about .5 seconds, which is still 16 times slower.
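
      The bulk-read workaround above could be kept out of application code by
      wrapping it in a small helper. The following is only a sketch under
      assumptions: the class BulkFloatReader and its methods decodeFloats and
      readFloats are hypothetical names, not part of java.io, and the data is
      assumed to be big-endian IEEE 754 floats, as DataInputStream.readFloat()
      expects.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper, not part of the JDK: isolates the byte-to-float
// conversion so callers only see an array-filling method.
final class BulkFloatReader {

    // Decodes dst.length big-endian floats from the start of b into dst.
    static void decodeFloats(byte[] b, float[] dst) {
        for (int i = 0, ib = 0; i < dst.length; i++, ib += 4) {
            int bits = ((b[ib] & 0xff) << 24)
                     | ((b[ib + 1] & 0xff) << 16)
                     | ((b[ib + 2] & 0xff) << 8)
                     |  (b[ib + 3] & 0xff);
            dst[i] = Float.intBitsToFloat(bits);
        }
    }

    // Reads exactly dst.length floats with one readFully call, then decodes.
    static void readFloats(DataInputStream in, float[] dst) throws IOException {
        byte[] b = new byte[dst.length * 4];
        in.readFully(b);          // throws EOFException if the stream is too short
        decodeFloats(b, dst);
    }
}
```

      With such a helper, the read loop would reduce to a single call such as
      BulkFloatReader.readFloats(din, data). This changes where the conversion
      code lives, not how fast it runs.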

      I don't want to put file-reading logic like this in our code. Also, this is
      just a test case. It looks like time spent reading and writing binary files may
      go from a few seconds to many minutes for us. This is a big deal, because the
      first thing a prospective user of the application will do is read historical
      geometry data to begin evaluation.

      I would like to believe that file reading can be made much faster; that there is
      nothing inherent in Java which prevents this.

      I was surprised to find that there is a native method for reading an array of
      bytes, but nothing similar for ints, floats, and doubles. With methods like
      these, file reading would be much faster. Could you folks consider coming up
      with changes like this?
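
      For readers on later Java releases: the java.nio package (added in JDK
      1.4, so not available for the 1.1.x versions this report covers) offers a
      bulk transfer of this kind through FloatBuffer.get(float[]). The sketch
      below is an illustration under assumptions, not the change requested
      here: the class NioFloatReader and its methods are hypothetical names,
      the file is assumed to hold big-endian floats, and it relies on
      Float.BYTES and java.nio.file, which need Java 8 or later.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class, not part of the JDK.
final class NioFloatReader {

    // Copies count floats out of buf (from its current position) in one bulk get.
    static float[] decode(ByteBuffer buf, int count) {
        float[] data = new float[count];
        buf.order(ByteOrder.BIG_ENDIAN).asFloatBuffer().get(data);
        return data;
    }

    // Reads count floats from the start of the file.
    static float[] readFloats(Path file, int count) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(count * Float.BYTES);
            while (buf.hasRemaining()) {
                if (ch.read(buf) < 0) {
                    throw new EOFException("file shorter than " + count + " floats");
                }
            }
            buf.flip();               // switch the buffer from filling to draining
            return decode(buf, count);
        }
    }
}
```

      The same pattern applies to ints and doubles through asIntBuffer() and
      asDoubleBuffer().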

      Finally, I assume that everybody in the numerical analysis community would
      benefit from much greater I/O rates.

            mr Mark Reinhold