The distribution of random values produced by os::next_random is pretty bad. This little test:
```
TEST_VM(os, rand_dist) {
  // Zero-initialize the counters; an uninitialized array would start from
  // indeterminate stack contents.
  int buckets[64] = {};
  constexpr unsigned num_buckets = sizeof(buckets) / sizeof(buckets[0]);
  // Split the full 32-bit value range into equally sized buckets.
  constexpr size_t total_range = (4 * G);
  constexpr unsigned range_bucket = (unsigned)(total_range / num_buckets);
  // Draw 10 million values and count how many fall into each bucket.
  for (unsigned i = 0; i < 10000000; i++) {
    unsigned x = (unsigned)os::random();
    unsigned y = x / range_bucket;
    buckets[y]++;
  }
  // Print the raw counts.
  for (unsigned i = 0; i < num_buckets; i++) {
    tty->print("%d ", buckets[i]);
  }
  // Find the fullest bucket to scale the chart.
  int largest = 0;
  for (unsigned i = 0; i < num_buckets; i++) {
    largest = MAX2(largest, buckets[i]);
  }
  // Print a crude bar chart, 16 lines high, one column per bucket.
  constexpr int num_lines = 16;
  const int step = largest / num_lines;
  for (int line = 0; line < num_lines; line++) {
    tty->cr();
    const int threshold = largest - (line * step);
    for (unsigned i = 0; i < num_buckets; i++) {
      tty->print("%c ", buckets[i] > threshold ? 'X' : ' ');
    }
  }
}
```
prints, for 10 million values:
```
1804613236 312218 -1971345340 1248903260 313243 313064 -228130408 312236 312027 312940 312756 312965 814028451 312553 311926 312095 814007330 313399 1923859592 300131575 54085151 17113941 312838 313075 312041 312827 -1781999222 431490 1804613540 312669 312917 312017 1804300624 1 -1971635044 84606977 0 0 0 0 1 0 -1782311381 118933 53772288 24576 1804300832 1 1804300816 1 -1971604812 -753893375 1716799706 0 253119 0 150085632 1 814760336 1 150089176 1 0 0
```
Very spiky, with a strong emphasis on lower values. Interestingly enough, the spikes are also somewhat independent of the seed; e.g., we always see a lot of values in the lowest bucket.
Note, however, that several of the printed counts are negative or far exceed the 10 million samples drawn, which is only possible if the counters did not start at zero, so those spikes may reflect uninitialized buckets in the run that produced this output rather than the RNG itself.
Consequences:
- This mostly affects gtests and some other parts of the JVM. Note that I have not tested whether this affects ihashes. The ihash RNG seeds are generated with os::random, so their seed quality suffers; on the other hand, the ihash RNG is a different one from then on.
Relates to: JDK-8329968 os::random should be random (Closed)