-
Bug
-
Resolution: Fixed
-
P3
-
24
-
master
-
riscv
-
linux
When running, for example, chi-square, a large performance regression can be seen on some hardware (in this case the P550).
These Renaissance benchmarks are highly compiler-dependent: results can vary by 30% from run to run due to differences in the code cache (caused both by profiling and by code placement).
One major factor is that pre-24 rv64 used trampoline calls:
##############
0x00007ff43025ee8c: jal ra,0x00007ff43025f16c // if the target was reachable we did a direct call here, otherwise via the trampoline
...
0x00007ff43025f16c: auipc t1,0x0 ; {trampoline_stub}
0x00007ff43025f170: ld t1,12(t1) # 0x00007ff43025f178
0x00007ff43025f174: jalr zero,0(t1)
0x00007ff43025f178: <8-byte address> // atomically patchable
#################
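The "if target reachable" condition in the listing above can be illustrated with a small sketch. It assumes only the standard RISC-V `jal` encoding (a signed 21-bit, 2-byte-aligned offset, i.e. roughly +/- 1 MiB). The function name `jal_reachable` is hypothetical and is not taken from HotSpot.

```python
# Sketch only: checks whether a target can be encoded as a jal offset.
# jal carries a signed 21-bit offset whose lowest bit is always zero,
# so the usable range is [-2^20, 2^20 - 2] bytes from the jal itself.
JAL_MIN_OFFSET = -(1 << 20)
JAL_MAX_OFFSET = (1 << 20) - 2


def jal_reachable(call_pc: int, target: int) -> bool:
    """Return True if `target` fits in the jal immediate at `call_pc`."""
    offset = target - call_pc
    return offset % 2 == 0 and JAL_MIN_OFFSET <= offset <= JAL_MAX_OFFSET


if __name__ == "__main__":
    # Addresses from the listing: the call site and its trampoline stub.
    call_pc = 0x00007FF43025EE8C
    stub = 0x00007FF43025F16C
    print(hex(stub - call_pc), jal_reachable(call_pc, stub))
```

The stub in the listing is only 0x2e0 bytes from the call site, so it is always in range. A callee elsewhere in the code cache may not be, which is why the trampoline holds a full 8-byte address.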
Due to issues with loading intra-cache and an unneeded jump, this was changed in "8332689: RISC-V: Use load instead of trampolines":
##################
0x00007ff3b4342c30: auipc t1,0x0
0x00007ff3b4342c34: ld t1,832(t1) # 0x00007ff3b4342f70
0x00007ff3b4342c38: jalr ra,0(t1)
...
0x00007ff3b4342f70: <8-byte address> // atomically patchable
#################
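As a cross-check of the annotated addresses in both listings, the slot read by an `auipc`/`ld` pair can be computed by hand. The sketch below assumes the standard RV64 semantics (`auipc` adds a sign-extended 20-bit immediate shifted left by 12 to the pc, and `ld` adds a sign-extended 12-bit offset). The helper names are hypothetical.

```python
# Sketch only: computes the address read by an auipc + ld pair.
MASK64 = (1 << 64) - 1


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def literal_address(auipc_pc: int, auipc_imm20: int, ld_imm12: int) -> int:
    """Address loaded by `auipc rd, imm20` followed by `ld rd, imm12(rd)`."""
    base = auipc_pc + (sign_extend(auipc_imm20, 20) << 12)
    return (base + sign_extend(ld_imm12, 12)) & MASK64


if __name__ == "__main__":
    # Load-based call site: auipc t1,0x0 ; ld t1,832(t1)
    print(hex(literal_address(0x00007FF3B4342C30, 0, 832)))
    # Trampoline stub: auipc t1,0x0 ; ld t1,12(t1)
    print(hex(literal_address(0x00007FF43025F16C, 0, 12)))
```

Both results match the `#` annotations in the disassembly: the 8-byte address slot sits 832 bytes after the call site in the load-based form, and 12 bytes after the start of the stub in the trampoline form.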
But this implementation did not have direct calls, as in practice they are rare.
- causes
  JDK-8367501 RISC-V: build broken after JDK-8365926 (Resolved)
- relates to
  JDK-8367402 RISC-V: philosophers (renaissance) benchmark investigation (Open)
  JDK-8332689 RISC-V: Use load instead of trampolines (Resolved)
  JDK-8343430 RISC-V: C2: Remove old trampoline call (Resolved)
- links to
  Commit(master) openjdk/jdk/5c1865a4
  Review(master) openjdk/jdk/26944