-
Bug
-
Resolution: Unresolved
-
P4
-
25
-
aarch64
The problem can be reproduced with the following simple program:
public class Main {
public static void main(String[] args) {
StringBuilder sb = new StringBuilder();
sb.append(new char[1_000_000]);
for (int i = 0; i < 1_000_000; i++) {
sb.append(12345);
sb.delete(0, 5);
}
}
}
Most of CPU time here is spent in `System.arraycopy` under `sb.delete`, however, profilers (perf, async-profiler) show something else due to missed PcDesc mapping, see PrintAssembly:
0x0000ffff82cfb0fc: cbz w16, 0x0000ffff82cfafa8
0x0000ffff82cfb100: add x0, x18, #0x10
0x0000ffff82cfb104: add x1, x13, #0x10
0x0000ffff82cfb108: cmp x2, #0x20
0x0000ffff82cfb10c: b.ls 0x0000ffff82cfaf9c // b.plast
0x0000ffff82cfb110: mov x21, x17
0x0000ffff82cfb114: bl Stub::Stub Generator jbyte_arraycopy_stub
; {runtime_call Stub::Stub Generator jbyte_arraycopy_stub}
0x0000ffff82cfb118: nop ; {post_call_nop}
0x0000ffff82cfb11c: movk xzr, #0x0
0x0000ffff82cfb120: movk xzr, #0x0
0x0000ffff82cfb124: ptrue p7.b
0x0000ffff82cfb128: mov x17, x21
0x0000ffff82cfb12c: b 0x0000ffff82cfafa8
0x0000ffff82cfb130: str w12, [sp, #32]
0x0000ffff82cfb134: str w15, [sp, #12]
0x0000ffff82cfb138: stp x17, x13, [sp, #16]
0x0000ffff82cfb13c: str w20, [sp, #8]
0x0000ffff82cfb140: str x19, [sp] ;*synchronization entry
; - java.lang.AbstractStringBuilder::ensureCapacitySameCoder@-1 (line 307)
; - java.lang.AbstractStringBuilder::append@25 (line 925)
; - java.lang.StringBuilder::append@2 (line 253)
; - Main::main@29 (line 7)
A similar effect is observed with some JDK microbenchmarks such as org.openjdk.bench.vm.gc.Alloc.testLargeConstArray:
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -jar benchmarks.jar -f 1 -wi 1 -i 10 -bm avgt -tu ns -rf json -rff results.json org.openjdk.bench.vm.gc.Alloc.testLargeVariableArray$
Stub::zero_blocks at address 0x0000ffff6c479528 will be incorrectly attributed to Blackhole::consume, even though Blackhole::consume does not allocate anything.
0x0000ffff6c479514: add x10, x0, #0x10
0x0000ffff6c479518: str w12, [x0, #12]
0x0000ffff6c47951c: mov x11, x16
0x0000ffff6c479520: subs x8, x11, #0x8
0x0000ffff6c479524: b.cc 0x0000ffff6c47952c // b.lo, b.ul, b.last
0x0000ffff6c479528: bl Stub::zero_blocks ; {runtime_call StubRoutines (final stubs)}
0x0000ffff6c47952c: tbz w11, #2, 0x0000ffff6c479538
0x0000ffff6c479530: stp xzr, xzr, [x10], #16
0x0000ffff6c479534: stp xzr, xzr, [x10], #16
0x0000ffff6c479538: tbz w11, #1, 0x0000ffff6c479540
0x0000ffff6c47953c: stp xzr, xzr, [x10], #16
0x0000ffff6c479540: tbz w11, #0, 0x0000ffff6c479548
0x0000ffff6c479544: str xzr, [x10]
0x0000ffff6c479548: dmb ishst ;*invokestatic consumeCompiler {reexecute=0 rethrow=0 return_oop=0}
; - org.openjdk.jmh.infra.Blackhole::consume@7 (line 323)
; - org.openjdk.bench.vm.gc.Alloc::testLargeVariableArray@23 (line 60)
; - org.openjdk.bench.vm.gc.jmh_generated.Alloc_testLargeVariableArray_jmhTest::testLargeVariableArray_avgt_jmhStub@17 (line 190)
The issue happens with all JDKs from 11 to 26 on AArch64 only.
----
The bug is submitted on behalf of @Baraa-Hasheesh who originally investigated the issue reported at https://github.com/async-profiler/async-profiler/issues/1505
public class Main {
public static void main(String[] args) {
StringBuilder sb = new StringBuilder();
sb.append(new char[1_000_000]);
for (int i = 0; i < 1_000_000; i++) {
sb.append(12345);
sb.delete(0, 5);
}
}
}
Most of CPU time here is spent in `System.arraycopy` under `sb.delete`, however, profilers (perf, async-profiler) show something else due to missed PcDesc mapping, see PrintAssembly:
0x0000ffff82cfb0fc: cbz w16, 0x0000ffff82cfafa8
0x0000ffff82cfb100: add x0, x18, #0x10
0x0000ffff82cfb104: add x1, x13, #0x10
0x0000ffff82cfb108: cmp x2, #0x20
0x0000ffff82cfb10c: b.ls 0x0000ffff82cfaf9c // b.plast
0x0000ffff82cfb110: mov x21, x17
0x0000ffff82cfb114: bl Stub::Stub Generator jbyte_arraycopy_stub
; {runtime_call Stub::Stub Generator jbyte_arraycopy_stub}
0x0000ffff82cfb118: nop ; {post_call_nop}
0x0000ffff82cfb11c: movk xzr, #0x0
0x0000ffff82cfb120: movk xzr, #0x0
0x0000ffff82cfb124: ptrue p7.b
0x0000ffff82cfb128: mov x17, x21
0x0000ffff82cfb12c: b 0x0000ffff82cfafa8
0x0000ffff82cfb130: str w12, [sp, #32]
0x0000ffff82cfb134: str w15, [sp, #12]
0x0000ffff82cfb138: stp x17, x13, [sp, #16]
0x0000ffff82cfb13c: str w20, [sp, #8]
0x0000ffff82cfb140: str x19, [sp] ;*synchronization entry
; - java.lang.AbstractStringBuilder::ensureCapacitySameCoder@-1 (line 307)
; - java.lang.AbstractStringBuilder::append@25 (line 925)
; - java.lang.StringBuilder::append@2 (line 253)
; - Main::main@29 (line 7)
A similar effect is observed with some JDK microbenchmarks such as org.openjdk.bench.vm.gc.Alloc.testLargeConstArray:
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -jar benchmarks.jar -f 1 -wi 1 -i 10 -bm avgt -tu ns -rf json -rff results.json org.openjdk.bench.vm.gc.Alloc.testLargeVariableArray$
Stub::zero_blocks at address 0x0000ffff6c479528 will be incorrectly attributed to Blackhole::consume, even though Blackhole::consume does not allocate anything.
0x0000ffff6c479514: add x10, x0, #0x10
0x0000ffff6c479518: str w12, [x0, #12]
0x0000ffff6c47951c: mov x11, x16
0x0000ffff6c479520: subs x8, x11, #0x8
0x0000ffff6c479524: b.cc 0x0000ffff6c47952c // b.lo, b.ul, b.last
0x0000ffff6c479528: bl Stub::zero_blocks ; {runtime_call StubRoutines (final stubs)}
0x0000ffff6c47952c: tbz w11, #2, 0x0000ffff6c479538
0x0000ffff6c479530: stp xzr, xzr, [x10], #16
0x0000ffff6c479534: stp xzr, xzr, [x10], #16
0x0000ffff6c479538: tbz w11, #1, 0x0000ffff6c479540
0x0000ffff6c47953c: stp xzr, xzr, [x10], #16
0x0000ffff6c479540: tbz w11, #0, 0x0000ffff6c479548
0x0000ffff6c479544: str xzr, [x10]
0x0000ffff6c479548: dmb ishst ;*invokestatic consumeCompiler {reexecute=0 rethrow=0 return_oop=0}
; - org.openjdk.jmh.infra.Blackhole::consume@7 (line 323)
; - org.openjdk.bench.vm.gc.Alloc::testLargeVariableArray@23 (line 60)
; - org.openjdk.bench.vm.gc.jmh_generated.Alloc_testLargeVariableArray_jmhTest::testLargeVariableArray_avgt_jmhStub@17 (line 190)
The issue happens with all JDKs from 11 to 26 on AArch64 only.
----
The bug is submitted on behalf of @Baraa-Hasheesh who originally investigated the issue reported at https://github.com/async-profiler/async-profiler/issues/1505
- relates to
-
JDK-8281677 Invalid stack traces in JFR when approximating due to a lack of debug information in compiled code
-
- In Progress
-