-
Enhancement
-
Resolution: Fixed
-
P4
-
None
-
b07
Currently it is implemented as
public static int highestOneBit(int i) {
// HD, Figure 3-1
i |= (i >> 1);
i |= (i >> 2);
i |= (i >> 4);
i |= (i >> 8);
i |= (i >> 16);
return i - (i >>> 1);
}
Alternative implementation may be
public static int highestOneBit(int i) {
return i == 0 ? 0 : MIN_VALUE >>> numberOfLeadingZeros(i);
}
Even though it contains a branch, a benchmark shows the performance improvement (+27% to the throughput):
Benchmark (arg) Mode Cnt Score Error Units
MyBenchmark.int_testMethod_new 0 thrpt 35 323430664.593 ± 7492044.171 ops/s
MyBenchmark.int_testMethod_new 42 thrpt 35 298526237.078 ± 5978291.689 ops/s
MyBenchmark.int_testMethod_new -42 thrpt 35 302903562.073 ± 7984723.721 ops/s
MyBenchmark.int_testMethod_org 0 thrpt 35 236245042.891 ± 3635990.596 ops/s
MyBenchmark.int_testMethod_org 42 thrpt 35 237903410.753 ± 3437684.390 ops/s
MyBenchmark.int_testMethod_org -42 thrpt 35 238472580.618 ± 2654886.010 ops/s
MyBenchmark.long_testMethod_new 0 thrpt 35 282646114.501 ± 48028366.305 ops/s
MyBenchmark.long_testMethod_new 42 thrpt 35 282382228.405 ± 5781529.307 ops/s
MyBenchmark.long_testMethod_new -42 thrpt 35 276724858.286 ± 6529561.227 ops/s
MyBenchmark.long_testMethod_org 0 thrpt 35 198500211.972 ± 15096862.367 ops/s
MyBenchmark.long_testMethod_org 42 thrpt 35 215854630.194 ± 3112930.563 ops/s
MyBenchmark.long_testMethod_org -42 thrpt 35 217992805.521 ± 2622877.082 ops/s
This may be due to the numberOfLeadingZeros() being intrinsified.
public static int highestOneBit(int i) {
// HD, Figure 3-1
i |= (i >> 1);
i |= (i >> 2);
i |= (i >> 4);
i |= (i >> 8);
i |= (i >> 16);
return i - (i >>> 1);
}
Alternative implementation may be
public static int highestOneBit(int i) {
return i == 0 ? 0 : MIN_VALUE >>> numberOfLeadingZeros(i);
}
Even though it contains a branch, a benchmark shows the performance improvement (+27% to the throughput):
Benchmark (arg) Mode Cnt Score Error Units
MyBenchmark.int_testMethod_new 0 thrpt 35 323430664.593 ± 7492044.171 ops/s
MyBenchmark.int_testMethod_new 42 thrpt 35 298526237.078 ± 5978291.689 ops/s
MyBenchmark.int_testMethod_new -42 thrpt 35 302903562.073 ± 7984723.721 ops/s
MyBenchmark.int_testMethod_org 0 thrpt 35 236245042.891 ± 3635990.596 ops/s
MyBenchmark.int_testMethod_org 42 thrpt 35 237903410.753 ± 3437684.390 ops/s
MyBenchmark.int_testMethod_org -42 thrpt 35 238472580.618 ± 2654886.010 ops/s
MyBenchmark.long_testMethod_new 0 thrpt 35 282646114.501 ± 48028366.305 ops/s
MyBenchmark.long_testMethod_new 42 thrpt 35 282382228.405 ± 5781529.307 ops/s
MyBenchmark.long_testMethod_new -42 thrpt 35 276724858.286 ± 6529561.227 ops/s
MyBenchmark.long_testMethod_org 0 thrpt 35 198500211.972 ± 15096862.367 ops/s
MyBenchmark.long_testMethod_org 42 thrpt 35 215854630.194 ± 3112930.563 ops/s
MyBenchmark.long_testMethod_org -42 thrpt 35 217992805.521 ± 2622877.082 ops/s
This may be due to the numberOfLeadingZeros() being intrinsified.