Enhancement
Resolution: Unresolved
P4
23
This came up in this email thread:
https://mail.openjdk.org/pipermail/panama-dev/2024-March/020320.html
Several benchmarks are discussed there, but this bug applies to the following:
Benchmark                         Mode  Cnt    Score     Error  Units
AddBenchmark.scalarArrayArray     avgt    5  198.951 ±   1.078  ns/op
AddBenchmark.scalarUnsafeArray    avgt    5  133.374 ±   5.114  ns/op
AddBenchmark.unrolledArrayArray   avgt    5  580.045 ±  11.589  ns/op
AddBenchmark.unrolledUnsafeArray  avgt    5  247.511 ±   2.528  ns/op
The unrolled* benchmarks are hand-unrolled versions of the scalar* ones.
scalarArrayArray vectorizes but unrolledArrayArray doesn't. unrolledArrayArray does vectorize if I raise LoopUnrollLimit, but then it isn't unrolled as much as scalarArrayArray. unrolledUnsafeArray vectorizes but is likewise unrolled less than scalarUnsafeArray.
It seems wrong that hand-unrolled loops don't compile down to the same code as loops unrolled by the compiler.
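For readers without the thread at hand, here is a minimal sketch of the two loop shapes being compared. It is not the benchmark source from the email thread: the class name AddLoops, the int element type, and the unroll factor of 4 are all assumptions made for illustration, and the Unsafe variants are omitted.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the AddBenchmark code from the thread.
final class AddLoops {
    // Plain loop: one element per iteration. This is the shape the report
    // says the compiler unrolls and vectorizes on its own.
    static void scalarAdd(int[] a, int[] b, int[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Hand-unrolled version of the same loop. The factor of 4 is an
    // assumption; a scalar tail loop handles the leftover elements.
    static void unrolledAdd(int[] a, int[] b, int[] out) {
        int i = 0;
        int limit = out.length - 3;
        for (; i < limit; i += 4) {
            out[i]     = a[i]     + b[i];
            out[i + 1] = a[i + 1] + b[i + 1];
            out[i + 2] = a[i + 2] + b[i + 2];
            out[i + 3] = a[i + 3] + b[i + 3];
        }
        for (; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7};
        int[] b = {10, 20, 30, 40, 50, 60, 70};
        int[] s = new int[a.length];
        int[] u = new int[a.length];
        scalarAdd(a, b, s);
        unrolledAdd(a, b, u);
        // Both loops compute the same sums; only the loop shape differs.
        System.out.println(Arrays.equals(s, u));
    }
}
```

Both methods produce identical results, so any difference in the generated code comes from how the compiler treats the loop shape rather than from the arithmetic.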