Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8294509

The sign extension bug applies to 'public static int[] convertSeedBytesToInts(byte[] seed, int n, int z)' in RandomSupport

XMLWordPrintable

    • b18
    • generic
    • generic
    • Verified

      A DESCRIPTION OF THE PROBLEM :
      This is related to bug JDK-8282144. Please see https://bugs.openjdk.org/browse/JDK-8282144?focusedCommentId=14482366&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14482366 for more details.

      The method attempts to create an array of longs by consuming the input bytes most significant bit first. New bytes are appended to the existing long using the OR operator on the signed byte. Due to sign extension this will overwrite all the existing bits from 63 to 8 if the next byte is negative.


      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      The bug will not readily manifest as a poorly seeded RNG with many of the bits in the generated int[] set to 1 quickly recovers and outputs a random sequence. Reproducing cannot be done without access to the internal class RandomSupport.

      I have extracted the part of the class with the bug into a driver program:

      public class RandomSupportInts {
        private static final char[] HEX_DIGITS =
            {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};

        public static int[] convertSeedBytesToInts(byte[] seed, int n, int z) {
          final int[] result = new int[n];
          final int m = Math.min(seed.length, n << 2);
          // Distribute seed bytes into the words to be formed.
          for (int j = 0; j < m; j++) {
            result[j>>2] = (result[j>>2] << 8) | seed[j];
          }
          // Filling the rest of the int[] has been removed for brevity
          return result;
        }

        public static int[] convertSeedBytesToIntsFixed(byte[] seed, int n, int z) {
          final int[] result = new int[n];
          final int m = Math.min(seed.length, n << 2);
          // Distribute seed bytes into the words to be formed.
          for (int j = 0; j < m; j++) {
            result[j>>2] = (result[j>>2] << 8) | (seed[j] & 0xff);
          }
          // Filling the rest of the int[] has been removed for brevity
          return result;
        }

        public static void main(String[] args) {
          RandomGenerator rng = RandomGeneratorFactory.of("L64X128MixRandom").create(42);
          for (int i = 1; i < Integer.BYTES; i++) {
            byte[] seed = new byte[i];
            for (int j = 0; j < 10; j++) {
              rng.nextBytes(seed);

              for (byte b : seed) {
                System.out.printf("%c%c", HEX_DIGITS[(b & 0xf0) >> 4], HEX_DIGITS[b & 0xf]);
              }
              System.out.printf(" %-16s %-16s%n",
                  Integer.toHexString(convertSeedBytesToInts(seed, 1, 1)[0]),
                  Integer.toHexString(convertSeedBytesToIntsFixed(seed, 1, 1)[0]));
            }
          }
        }
      }


      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Compile and run the driver program:

      > javac RandomSupportInts.java
      > java RandomSupportInts


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      Column: Description
      1: Hex representation of the input byte[], most significant bits first
      2: Current long output (hex representation)
      3: Corrected long output (hex representation)

      12 12 12
      10 10 10
      38 38 38
      ec ffffffec ec
      8a ffffff8a 8a
      7e 7e 7e
      cf ffffffcf cf
      12 12 12
      11 11 11
      d4 ffffffd4 d4
      7ae2 ffffffe2 7ae2
      47c8 ffffffc8 47c8
      3cc0 ffffffc0 3cc0
      2475 2475 2475
      fdce ffffffce fdce
      7c04 7c04 7c04
      16a4 ffffffa4 16a4
      0bf1 fffffff1 bf1
      cb22 ffffcb22 cb22
      63eb ffffffeb 63eb
      6d72b7 ffffffb7 6d72b7
      ddd3a6 ffffffa6 ddd3a6
      6194a2 ffffffa2 6194a2
      d54bc0 ffffffc0 d54bc0
      3142f5 fffffff5 3142f5
      643056 643056 643056
      ff94ba ffffffba ff94ba
      86923e ffff923e 86923e
      faf8f0 fffffff0 faf8f0
      b69017 ffff9017 b69017
      ACTUAL -
      See expected result. The int created by the current code has many bits of the original input bytes overwritten with 1s (the output hex representation contains a lot of leading f characters). This results in loss of information in byte[] seeds.

      CUSTOMER SUBMITTED WORKAROUND :
      None. Currently any generator created using a random byte[] will not be seeded with the full randomness of the bytes. Using a random byte[] will generate an int[] array for seeding where only the least significant 8-bits of each long are ensured to be random. The more significant bits have an increasing chance of being set to 1s. This is 1 - 1/(2^3) = 0.875 for the highest 8-bits in the int if 4 bytes were used to create it (as it has 3 chances to overwrite the bits as they are shifted up).

      FREQUENCY : always


            rgiulietti Raffaello Giulietti
            webbuggrp Webbug Group
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: