Details
Type: Bug
Resolution: Won't Fix
Priority: P3
Affected version: 18, 19, 21, 22
CPU: aarch64
OS: generic
Description
StringLatin1.indexOf(byte[], int, byte[], int, int) has an intrinsic implementation, which is expected to run faster than the equivalent Java code.
JMH benchmarks, however, reveal that the Java code is more than 10x faster.
On macOS with an M1 CPU, the results on JDK 19.0.2 are:
Benchmark                Mode  Cnt       Score     Error  Units
IndexOfStr.testIndexOf   avgt   15  188411.351 ± 599.219  ns/op
With the intrinsic disabled (export JDK_JAVA_OPTIONS='-XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_indexOfIL'), the outcome is:
Benchmark                Mode  Cnt      Score    Error  Units
IndexOfStr.testIndexOf   avgt   15  11865.268 ± 37.313  ns/op
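When comparing the two runs, it can help to confirm that the DisableIntrinsic setting shown above actually reached the JVM under test. The following is a minimal sketch, not part of the original report: the class IntrinsicFlagCheck and the helper isIntrinsicDisabled are hypothetical names, and it assumes that options picked up from JDK_JAVA_OPTIONS are listed by RuntimeMXBean.getInputArguments(). It only checks that the flag was passed, not that the VM honored it.

```java
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.List;

class IntrinsicFlagCheck {
    // Hypothetical helper: returns true if some -XX:DisableIntrinsic=... argument
    // lists the given intrinsic id (the flag accepts a comma-separated list).
    static boolean isIntrinsicDisabled(List<String> jvmArgs, String intrinsicId) {
        String prefix = "-XX:DisableIntrinsic=";
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                List<String> ids = Arrays.asList(arg.substring(prefix.length()).split(","));
                if (ids.contains(intrinsicId)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arguments the running JVM was started with (assumed to include
        // anything contributed by JDK_JAVA_OPTIONS).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("_indexOfIL disabled: " + isIntrinsicDisabled(jvmArgs, "_indexOfIL"));
    }
}
```

Running this once with and once without the exported variable should show whether the option is visible to the process.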
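And here's the benchmark:
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@Fork(3)
@Warmup(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class IndexOfStr {
    private int length = 100_000;
    // Haystack: 50,000 zeros, a single '1', then another 50,000 zeros.
    private String z = "0".repeat(length / 2);
    private String s = z + '1' + z;
    // Start the search one third of the way in, before the single '1'.
    private int begin = length / 3;

    @Benchmark
    public int testIndexOf() {
        // The needle contains two '1' characters but the haystack only one,
        // so the search never matches and returns -1.
        return s.indexOf("10000000000000000001", begin);
    }
}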
The same applies to StringUTF16.indexOf(byte[], int, byte[], int, int).
Similar outcomes have been measured on JDK 18 and JDK 21-ea, by two independent developers on two different machines.
The fields "Affected version", "CPU", and "OS" have been set to the ones where the benchmarks were run. The issue might affect other combinations as well.
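To make the benchmarked search concrete without JMH, here is a minimal sketch of a plain-Java substring scan over Latin-1 bytes, run on the same input as the benchmark. It is an illustration only: the class IndexOfSketch and the method naiveIndexOf are hypothetical names, the three-argument signature is simplified, and the loop is a straightforward naive scan, not the JDK's actual StringLatin1.indexOf code.

```java
import java.nio.charset.StandardCharsets;

class IndexOfSketch {
    // Hypothetical helper, not the JDK's code: naive scan that tries every
    // start position from fromIndex and compares the target byte by byte.
    static int naiveIndexOf(byte[] value, byte[] target, int fromIndex) {
        int from = Math.max(fromIndex, 0);
        if (target.length == 0) {
            return Math.min(from, value.length);
        }
        int max = value.length - target.length;
        for (int i = from; i <= max; i++) {
            int j = 0;
            while (j < target.length && value[i + j] == target[j]) {
                j++;
            }
            if (j == target.length) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Same input as the benchmark in the report.
        int length = 100_000;
        String z = "0".repeat(length / 2);
        String s = z + '1' + z;
        String needle = "10000000000000000001";
        int begin = length / 3;

        byte[] value = s.getBytes(StandardCharsets.ISO_8859_1);
        byte[] target = needle.getBytes(StandardCharsets.ISO_8859_1);

        // Both print -1: the haystack holds a single '1', the needle needs two.
        System.out.println("naive:          " + naiveIndexOf(value, target, begin));
        System.out.println("String.indexOf: " + s.indexOf(needle, begin));
    }
}
```

Because the needle can never match, each benchmark invocation scans from position begin to the end of the string, so the measurement reflects the cost of an unsuccessful search.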