-
Bug
-
Resolution: Fixed
-
P3
-
19
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8311699 | 17.0.9 | Fei Yang | P3 | Resolved | Fixed | b01 |
MacroAssembler::movptr() is designed to load a 47-bit (unsigned) address constant, ranging [0x0, 0x7FFF_FFFF_FFFF], and a special case -1 (the Universe::non_oop_word() as we know, which is 0xFFFF_FFFF_FFFF_FFFF). The former ones are inside a sv48 address space range[1]. Please note that under sv48 a valid address has the bit 47 equal to 0 in user space, so that MacroAssembler::movptr() could handle all cases under sv48. However, when loading an immediate value ranging [0x7FFF_8000_0000, 0x7FFF_FFFF_FFFF] using it, the results would wrongly become [0xFFFF_7FFF_8000_0000, 0xFFFF_7FFF_FFFF_FFFF], which indicates the MSB has polluted high bits in rare cases.
MacroAssembler::movptr() is a composition of `lui+addi+slli+addi+slli+addi`, and all of them are signed operations, MIPS alike.
Precisely, the first `lui+addi` aims to load the first 32-bit; then the `slli+addi` would load the 11-bit; finally the last `slli+addi` is going to load the remaining 5-bit.
To deal with this, there are two approaches:
(a) Using an addiw to replace the first addi. addiw has nearly the same semantics as addi, but after the operation the result would be sign-extended according to the bit 31. Due to this feature, we could use this to clean up the dirty high bits at all times. This could also handle the (-1) case. However, Assembler::li32(), which is composed of `lui+addiw`, will conflict with the new implementation, needing further adaptations. (Personally I a bit dislike of that)
(b) Alike V8's implementation [2], the trick here is it loads only the first 31-bit using `lui+addi`, with a leading 0 as the bit 31. So this one could prevent this issue at the beginning. As a trade-off, we need to shift one another bit because the leading 0 occupies one bit. Also this one could also handle the (-1) case as well after minor adaptations. (I like this one)
This problem could be reproduced using `-XX:CompressedClassSpaceBaseAddress=0x7FFFF8000000 -XX:CompressedClassSpaceSize=40M -Xshare:off` with fastdebug build, and on Qemu only, for currently I have no access to hardware that supports sv48, and the kernel Ubuntu[3] relies on is Linux 5.15. The kernel (TIP) would first check if hardware sponsors sv57, if not then fall back to sv48, and so on. It is not until Linux 5.17 that sv48 is supported[4]. So this issue could never be reproduced on my boards. But fortunately Qemu could sponsor this, because one could mmap an address in 48-bit address space even in a user-level Qemu.
Tested with `-XX:CompressedClassSpaceBaseAddress=0x7FFFF8000000 -XX:CompressedClassSpaceSize=40M -Xshare:off` (reproducible) on Qemu with hotspot tier1 (we should ignore OOM caused the compressed class space), and other tiers are on the way.
Testing sanity hotspot tier1~tier4 (could not reproduce). Tier1 is finished without new failures.
[1] https://github.com/riscv/riscv-isa-manual/blob/9ec8c0105dbf1492b57f6cafdb90a268628f476a/src/supervisor.tex#L1999-L2006
[2] https://github.com/v8/v8/blob/main/src/codegen/riscv64/assembler-riscv64.cc#L3479-L3495
[3] https://cdimage.ubuntu.com/releases/22.04/release/
[4] https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.17-RISC-V-sv48
MacroAssembler::movptr() is a composition of `lui+addi+slli+addi+slli+addi`, and all of them are signed operations, MIPS alike.
Precisely, the first `lui+addi` aims to load the first 32-bit; then the `slli+addi` would load the 11-bit; finally the last `slli+addi` is going to load the remaining 5-bit.
To deal with this, there are two approaches:
(a) Using an addiw to replace the first addi. addiw has nearly the same semantics as addi, but after the operation the result would be sign-extended according to the bit 31. Due to this feature, we could use this to clean up the dirty high bits at all times. This could also handle the (-1) case. However, Assembler::li32(), which is composed of `lui+addiw`, will conflict with the new implementation, needing further adaptations. (Personally I a bit dislike of that)
(b) Alike V8's implementation [2], the trick here is it loads only the first 31-bit using `lui+addi`, with a leading 0 as the bit 31. So this one could prevent this issue at the beginning. As a trade-off, we need to shift one another bit because the leading 0 occupies one bit. Also this one could also handle the (-1) case as well after minor adaptations. (I like this one)
This problem could be reproduced using `-XX:CompressedClassSpaceBaseAddress=0x7FFFF8000000 -XX:CompressedClassSpaceSize=40M -Xshare:off` with fastdebug build, and on Qemu only, for currently I have no access to hardware that supports sv48, and the kernel Ubuntu[3] relies on is Linux 5.15. The kernel (TIP) would first check if hardware sponsors sv57, if not then fall back to sv48, and so on. It is not until Linux 5.17 that sv48 is supported[4]. So this issue could never be reproduced on my boards. But fortunately Qemu could sponsor this, because one could mmap an address in 48-bit address space even in a user-level Qemu.
Tested with `-XX:CompressedClassSpaceBaseAddress=0x7FFFF8000000 -XX:CompressedClassSpaceSize=40M -Xshare:off` (reproducible) on Qemu with hotspot tier1 (we should ignore OOM caused the compressed class space), and other tiers are on the way.
Testing sanity hotspot tier1~tier4 (could not reproduce). Tier1 is finished without new failures.
[1] https://github.com/riscv/riscv-isa-manual/blob/9ec8c0105dbf1492b57f6cafdb90a268628f476a/src/supervisor.tex#L1999-L2006
[2] https://github.com/v8/v8/blob/main/src/codegen/riscv64/assembler-riscv64.cc#L3479-L3495
[3] https://cdimage.ubuntu.com/releases/22.04/release/
[4] https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.17-RISC-V-sv48
- backported by
-
JDK-8311699 riscv: Fix correctness issue of MacroAssembler::movptr
- Resolved
- links to
-
Commit openjdk/jdk17u-dev/966fc82d
-
Commit openjdk/jdk/447ae006
-
Commit openjdk/riscv-port-jdk11u/309291f1
-
Review openjdk/jdk17u-dev/1427
-
Review openjdk/jdk/8913
-
Review openjdk/riscv-port-jdk11u/3
(2 links to)