Type: Bug
Resolution: Duplicate
Priority: P4
Affects Version/s: 22, 23, 24
`MemorySegment::fill` is roughly an order of magnitude slower on aarch64 platforms than on x64 platforms for small heap segments.
Once this has been fixed, `FILL_NATIVE_THRESHOLD` in `AbstractMemorySegmentImpl.java` should also be updated.
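For context, a threshold of this kind typically selects between a plain Java loop for small segments and a bulk fill for larger ones, which is why it would need retuning once the bulk path gets faster. The following is a minimal sketch of that dispatch pattern, not the JDK's actual implementation: the class `ThresholdFill`, the method `fill`, and the cut-over value of 32 bytes are hypothetical and chosen only for illustration.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class ThresholdFill {
    // Hypothetical cut-over point in bytes; a real value would be tuned per platform.
    public static final long FILL_THRESHOLD = 32;

    // Fills the whole segment with `value`. Small segments are filled with a
    // Java loop; larger ones are delegated to MemorySegment::fill.
    public static MemorySegment fill(MemorySegment segment, byte value) {
        long size = segment.byteSize();
        if (size < FILL_THRESHOLD) {
            for (long i = 0; i < size; i++) {
                segment.set(ValueLayout.JAVA_BYTE, i, value);
            }
            return segment;
        }
        return segment.fill(value);
    }

    public static void main(String[] args) {
        // Exercise both sides of the threshold and check every byte.
        for (int size : new int[] {0, 7, 31, 32, 128}) {
            byte[] array = new byte[size];
            fill(MemorySegment.ofArray(array), (byte) 0x5A);
            for (byte b : array) {
                if (b != 0x5A) {
                    throw new AssertionError("unexpected byte for size " + size);
                }
            }
        }
        System.out.println("all sizes filled correctly");
    }
}
```

If the bulk path on a given platform is slow for small sizes, as the aarch64 figures in this report suggest, raising the threshold there keeps more calls on the loop path.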
Benchmark results (header row: bytes per operation; values: ns per operation):
Size (bytes)  0     1     2     3     4     5     6     7     8     16    21    32    64
Linux a64     3.4   40.5  40.0  40.0  39.0  40.0  41.0  40.0  39.0  39.0  41.0  39.0  40.0
Linux x64     2.2   5.7   4.9   5.7   4.6   5.7   6.6   5.7   4.5   4.9   5.7   6.2   4.8
macOS a64     1.6   52.0  52.0  54.0  52.0  55.0  54.0  55.0  53.0  53.0  56.0  54.0  53.0
macOS x64     2.0   7.2   5.2   7.1   4.9   6.9   5.5   6.9   4.2   4.5   8.6   5.0   5.2
Windows x64   1.9   4.5   3.9   4.5   3.5   5.4   5.1   4.5   3.4   3.7   4.5   4.8   3.7
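To put a number on the gap, the per-size ratio of aarch64 to x64 timings can be computed directly from the table above. The sketch below does that for the Linux and macOS rows; the class `FillRatios` and the method `ratioRange` are hypothetical names, and the first column (size 0) is skipped because nothing is filled there.

```java
import java.util.Locale;

public class FillRatios {
    // ns/op values copied from the table above, in column order.
    public static final double[] LINUX_A64 =
        {3.4, 40.5, 40.0, 40.0, 39.0, 40.0, 41.0, 40.0, 39.0, 39.0, 41.0, 39.0, 40.0};
    public static final double[] LINUX_X64 =
        {2.2, 5.7, 4.9, 5.7, 4.6, 5.7, 6.6, 5.7, 4.5, 4.9, 5.7, 6.2, 4.8};
    public static final double[] MACOS_A64 =
        {1.6, 52.0, 52.0, 54.0, 52.0, 55.0, 54.0, 55.0, 53.0, 53.0, 56.0, 54.0, 53.0};
    public static final double[] MACOS_X64 =
        {2.0, 7.2, 5.2, 7.1, 4.9, 6.9, 5.5, 6.9, 4.2, 4.5, 8.6, 5.0, 5.2};

    // Returns {min, max} of slow[i] / fast[i], skipping index 0 (the size-0 column).
    public static double[] ratioRange(double[] slow, double[] fast) {
        double min = Double.MAX_VALUE;
        double max = 0.0;
        for (int i = 1; i < slow.length; i++) {
            double ratio = slow[i] / fast[i];
            min = Math.min(min, ratio);
            max = Math.max(max, ratio);
        }
        return new double[] {min, max};
    }

    public static void main(String[] args) {
        double[] linux = ratioRange(LINUX_A64, LINUX_X64);
        double[] macos = ratioRange(MACOS_A64, MACOS_X64);
        System.out.printf(Locale.ROOT, "Linux: %.1fx to %.1fx%n", linux[0], linux[1]);
        System.out.printf(Locale.ROOT, "macOS: %.1fx to %.1fx%n", macos[0], macos[1]);
        // prints:
        // Linux: 6.2x to 8.7x
        // macOS: 6.5x to 12.6x
    }
}
```

On these figures the aarch64 slowdown for non-empty fills is about 6x to 9x on Linux and 6x to 13x on macOS, while the size-0 case is comparable across platforms.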
import java.lang.foreign.MemorySegment;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(value = 3)
public class TestFill {

    // Segment size in bytes
    @Param({"0", "1", "2", "3", "4", "5", "6", "7", "8", "16", "32", "64", "128"})
    public int ELEM_SIZE;

    MemorySegment heapSegment;

    @Setup
    public void setup() {
        // Back the segment with an on-heap byte array of the requested size
        var array = new byte[ELEM_SIZE];
        heapSegment = MemorySegment.ofArray(array);
    }

    @Benchmark
    public void heap_segment_fill() {
        // Fill the entire heap segment with zero bytes
        heapSegment.fill((byte) 0);
    }
}
duplicates
- JDK-8338967 Improve performance for MemorySegment::fill (Resolved)
relates to
- JDK-8338967 Improve performance for MemorySegment::fill (Resolved)
- JDK-8339917 Implement Unsafe.setMemory intrinsic for Aarch64 (Closed)