Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8294509

The sign extension bug applies to 'public static int[] convertSeedBytesToInts(byte[] seed, int n, int z)' in RandomSupport

XMLWordPrintable

    • b18
    • generic
    • generic
    • Verified

        A DESCRIPTION OF THE PROBLEM :
        This is related to bug JDK-8282144. Please see https://bugs.openjdk.org/browse/JDK-8282144?focusedCommentId=14482366&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14482366 for more details.

        The method attempts to create an array of longs by consuming the input bytes most significant bit first. New bytes are appended to the existing long using the OR operator on the signed byte. Due to sign extension this will overwrite all the existing bits from 63 to 8 if the next byte is negative.


        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        The bug will not readily manifest as a poorly seeded RNG with many of the bits in the generated int[] set to 1 quickly recovers and outputs a random sequence. Reproducing cannot be done without access to the internal class RandomSupport.

        I have extracted the part of the class with the bug into a driver program:

        public class RandomSupportInts {
          private static final char[] HEX_DIGITS =
              {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};

          public static int[] convertSeedBytesToInts(byte[] seed, int n, int z) {
            final int[] result = new int[n];
            final int m = Math.min(seed.length, n << 2);
            // Distribute seed bytes into the words to be formed.
            for (int j = 0; j < m; j++) {
              result[j>>2] = (result[j>>2] << 8) | seed[j];
            }
            // Filling the rest of the int[] has been removed for brevity
            return result;
          }

          public static int[] convertSeedBytesToIntsFixed(byte[] seed, int n, int z) {
            final int[] result = new int[n];
            final int m = Math.min(seed.length, n << 2);
            // Distribute seed bytes into the words to be formed.
            for (int j = 0; j < m; j++) {
              result[j>>2] = (result[j>>2] << 8) | (seed[j] & 0xff);
            }
            // Filling the rest of the int[] has been removed for brevity
            return result;
          }

          public static void main(String[] args) {
            RandomGenerator rng = RandomGeneratorFactory.of("L64X128MixRandom").create(42);
            for (int i = 1; i < Integer.BYTES; i++) {
              byte[] seed = new byte[i];
              for (int j = 0; j < 10; j++) {
                rng.nextBytes(seed);

                for (byte b : seed) {
                  System.out.printf("%c%c", HEX_DIGITS[(b & 0xf0) >> 4], HEX_DIGITS[b & 0xf]);
                }
                System.out.printf(" %-16s %-16s%n",
                    Integer.toHexString(convertSeedBytesToInts(seed, 1, 1)[0]),
                    Integer.toHexString(convertSeedBytesToIntsFixed(seed, 1, 1)[0]));
              }
            }
          }
        }


        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        Compile and run the driver program:

        > javac RandomSupportInts.java
        > java RandomSupportInts


        EXPECTED VERSUS ACTUAL BEHAVIOR :
        EXPECTED -
        Column: Description
        1: Hex representation of the input byte[], most significant bits first
        2: Current long output (hex representation)
        3: Corrected long output (hex representation)

        12 12 12
        10 10 10
        38 38 38
        ec ffffffec ec
        8a ffffff8a 8a
        7e 7e 7e
        cf ffffffcf cf
        12 12 12
        11 11 11
        d4 ffffffd4 d4
        7ae2 ffffffe2 7ae2
        47c8 ffffffc8 47c8
        3cc0 ffffffc0 3cc0
        2475 2475 2475
        fdce ffffffce fdce
        7c04 7c04 7c04
        16a4 ffffffa4 16a4
        0bf1 fffffff1 bf1
        cb22 ffffcb22 cb22
        63eb ffffffeb 63eb
        6d72b7 ffffffb7 6d72b7
        ddd3a6 ffffffa6 ddd3a6
        6194a2 ffffffa2 6194a2
        d54bc0 ffffffc0 d54bc0
        3142f5 fffffff5 3142f5
        643056 643056 643056
        ff94ba ffffffba ff94ba
        86923e ffff923e 86923e
        faf8f0 fffffff0 faf8f0
        b69017 ffff9017 b69017
        ACTUAL -
        See expected result. The int created by the current code has many bits of the original input bytes overwritten with 1s (the output hex representation contains a lot of leading f characters). This results in loss of information in byte[] seeds.

        CUSTOMER SUBMITTED WORKAROUND :
        None. Currently any generator created using a random byte[] will not be seeded with the full randomness of the bytes. Using a random byte[] will generate an int[] array for seeding where only the least significant 8-bits of each long are ensured to be random. The more significant bits have an increasing chance of being set to 1s. This is 1 - 1/(2^3) = 0.875 for the highest 8-bits in the int if 4 bytes were used to create it (as it has 3 chances to overwrite the bits as they are shifted up).

        FREQUENCY : always


              rgiulietti Raffaello Giulietti
              webbuggrp Webbug Group
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Created:
                Updated:
                Resolved: