JDK-7054495

Faster methods for converting bytes to/from bigger primitives


    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: P4
    • None
    • 7
    • core-libs

      A DESCRIPTION OF THE REQUEST :
      As of Java 6, the fastest way to convert a contiguous range of bytes in an array to or from a bigger primitive such as int is the ByteBuffer.get*() and ByteBuffer.put*() methods. These methods are virtual, support different endian modes, and use bitwise operations to build a bigger primitive out of bytes. Because of all that, they are at least three orders of magnitude slower than the following C/C++ code:

          BYTE data[COUNT];
          ... // initialize data
          intValue = *(int*)(&data[offset]); // get operation
          *(int*)(&data[offset]) = intValue; // put operation

      This prevents Java from being used to build portable and fast interpreters, streaming applications and database engines.
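For reference, the ByteBuffer-based access described above can be sketched roughly as follows. The class and method names (ByteBufferIntAccess, getIntBigEndian) are hypothetical and exist only for illustration. The manual shift version shows the kind of bitwise composition the description refers to; it is not a claim about how ByteBuffer is implemented internally.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class, for illustration only.
final class ByteBufferIntAccess {
    private ByteBufferIntAccess() {}

    // Reads a 32-bit int at an absolute offset through a ByteBuffer view of the array.
    static int getInt(byte[] data, int offset, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getInt(offset);
    }

    // Writes a 32-bit int at an absolute offset through a ByteBuffer view of the array.
    static void putInt(byte[] data, int offset, int value, ByteOrder order) {
        ByteBuffer.wrap(data).order(order).putInt(offset, value);
    }

    // Builds the same big-endian int by hand with masks and shifts.
    static int getIntBigEndian(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 24)
             | ((data[offset + 1] & 0xFF) << 16)
             | ((data[offset + 2] & 0xFF) << 8)
             | (data[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] data = new byte[8];
        putInt(data, 2, 0x01020304, ByteOrder.BIG_ENDIAN);
        // Both read paths should agree on the stored value.
        System.out.println(getInt(data, 2, ByteOrder.BIG_ENDIAN) == getIntBigEndian(data, 2)); // prints "true"
    }
}
```

Each call goes through a buffer object and an explicit byte order, which is the overhead the C/C++ typecast avoids.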

      We need methods that are at most one order of magnitude slower than that code, since Java cannot allow the unsafe behavior of C/C++. My suggestion is to add the following methods to java.lang.System:

          // getters
          public static native short getNativeShort(byte[] data, int offset);
          public static native int getNativeInt(byte[] data, int offset);
          public static native long getNativeLong(byte[] data, int offset);
          public static native float getNativeFloat(byte[] data, int offset);
          public static native double getNativeDouble(byte[] data, int offset);

          // setters
          public static native void setNativeShort(byte[] data, int offset, short value);
          public static native void setNativeInt(byte[] data, int offset, int value);
          public static native void setNativeLong(byte[] data, int offset, long value);
          public static native void setNativeFloat(byte[] data, int offset, float value);
          public static native void setNativeDouble(byte[] data, int offset, double value);

      These new methods perform the expected range checking, but they use the platform's native endian mode and are implemented internally as a simple typecast, thus approaching C/C++ performance without breaking language safety.

      It's also very easy for the JVM to inline these methods, which further enhances performance.
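The intended semantics of the proposed methods (range checking plus the platform's byte order) can be approximated with existing APIs. The sketch below is only an emulation under that assumption: the class name NativeBytes is hypothetical, only the int and long variants are shown, and it does not deliver the performance this request asks for, since it still goes through ByteBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical emulation of the proposed System.getNative*/setNative* methods.
final class NativeBytes {
    private NativeBytes() {}

    // Wraps the array in a view that uses the platform's native byte order.
    private static ByteBuffer view(byte[] data) {
        return ByteBuffer.wrap(data).order(ByteOrder.nativeOrder());
    }

    // Absolute accessors throw IndexOutOfBoundsException when the value does not fit.
    static int getNativeInt(byte[] data, int offset) {
        return view(data).getInt(offset);
    }

    static void setNativeInt(byte[] data, int offset, int value) {
        view(data).putInt(offset, value);
    }

    static long getNativeLong(byte[] data, int offset) {
        return view(data).getLong(offset);
    }

    static void setNativeLong(byte[] data, int offset, long value) {
        view(data).putLong(offset, value);
    }

    public static void main(String[] args) {
        byte[] data = new byte[16];
        setNativeInt(data, 4, 0xCAFEBABE);
        setNativeLong(data, 8, 123456789012345L);
        System.out.println(Integer.toHexString(getNativeInt(data, 4))); // prints "cafebabe"
        System.out.println(getNativeLong(data, 8));                     // prints "123456789012345"
    }
}
```

The byte layout written by this sketch depends on the machine it runs on, which is the point of the proposal: data written and read on the same platform round-trips without any byte swapping.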

      JUSTIFICATION :
      The world needs to process data in dynamic ways, so class structure cannot always be predicted at development time. Using classloaders to generate classes at runtime is cumbersome and not well managed by the platform (classes are heavyweight platform objects and leak more easily).

      Also, the world deals with a huge number of binary protocols and file formats, and some of them are too complex to handle with DataInput/DataOutput or too slow to handle with ByteBuffer, as mentioned in the description.

      Finally, some software, such as interpreters or database engines, requires fast and dynamic interpretation of bytes in a given array. Without such features, most of this software becomes impractical to implement in Java.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      Please see the description.
      ACTUAL -
      Please see the description.

            Unassigned Unassigned
            webbuggrp Webbug Group