JDK-8296350

RISC-V: RVC crashes on some machines with old opensbi libs


    • Type: Bug
    • Resolution: Won't Fix
    • Priority: P3
    • None
    • 19, 20
    • hotspot
    • None
    • 19
    • riscv
    • linux

      ## <Problem>
      We have encountered two uncommon crashes when RVC is enabled on some RISC-V boards running old OpenSBI libs. After updating Ubuntu/OpenSBI, the problem goes away. Based on previous discussions [1][2], we think the issue is not caused by the implementation of the RVC extension itself.


      ## <To fix this>
      Users can update the outdated OpenSBI lib on their boards to fix this issue, or disable RVC with `-XX:-UseRVC` as a workaround.


      ## <Crash logs and cause>
      An example hs_err log (please see the inlined comment at pc address `0x0000003fa9732e22`):

      ```
      Registers:
      pc =0x0000003fbb56d640: <offset 0x000000000048a640> in /home/jenkins/workspace/riscv-openjdk/openjdk-17/lib/server/libjvm.so at 0x0000003fbb0e3000
      x1(ra) =0x0000003fa9732e26 is at code_begin+38 in
      [CodeBlob (0x0000003fa9732d90)]
      Framesize: 2
      UncommonTrapBlob
      --------------------------------------------------------------------------------
      Decoding CodeBlob, name: UncommonTrapBlob, at [0x0000003fa9732e00, 0x0000003fa9732ec0] 192 bytes
        0x0000003fa9732e00: addi sp,sp,-16
        0x0000003fa9732e02: sd ra,8(sp)
        0x0000003fa9732e04: sd s0,0(sp)
        0x0000003fa9732e06: sext.w a1,a1
        0x0000003fa9732e08: auipc t0,0x0
        0x0000003fa9732e0c: addi t0,t0,30 # 0x0000003fa9732e26
        0x0000003fa9732e10: sd t0,664(s7)
        0x0000003fa9732e14: mv t0,sp
        0x0000003fa9732e16: sd t0,656(s7)
        0x0000003fa9732e1a: mv a0,s7
        0x0000003fa9732e1c: li a2,2
        0x0000003fa9732e1e: auipc t0,0x11e3b
        0x0000003fa9732e22: jalr -2024(t0) # 0x0000003fbb56d636 = DeoptReasonSerializer::serialize(JfrCheckpointWriter&)+366 <--- This destination is certainly wrong for an UncommonTrapBlob. Unaligned access is broken by bugs in the old libs; patching an unaligned instruction like this one triggers those bugs, and the pc flies away.
        0x0000003fa9732e26: sd zero,656(s7)
        0x0000003fa9732e2a: sd zero,664(s7)
        0x0000003fa9732e2e: mv a4,a0
        0x0000003fa9732e30: addi sp,sp,16
        0x0000003fa9732e32: lwu a2,0(a4)
        0x0000003fa9732e36: addi a2,a2,-16
        0x0000003fa9732e38: add sp,sp,a2
        0x0000003fa9732e3a: ld s0,0(sp)
        0x0000003fa9732e3c: ld ra,8(sp)
        0x0000003fa9732e3e: addi sp,sp,16
        0x0000003fa9732e40: ld a2,24(a4)
        0x0000003fa9732e42: ld a5,16(a4)
      ```
      With RVC, instructions may be only 2-byte aligned. Patching such instructions triggers unaligned memory accesses, which we speculate can in turn trigger issues in the underlying libs [1][2].
      Updating the outdated libs resolves this, as verified by other RISC-V developers.
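
The alignment claim can be checked against the addresses in the log above. The following is a minimal Python sketch, not part of the original report: the helper names `auipc_jalr_target` and `is_aligned` are hypothetical, and the constants are copied from the hs_err excerpt. It recomputes the destination of the `auipc`/`jalr` pair and shows that both 4-byte instructions sit on addresses that are 2-byte aligned but not 4-byte aligned.

```python
MASK64 = (1 << 64) - 1

def auipc_jalr_target(pc: int, hi20: int, lo12: int) -> int:
    """Destination of `auipc t0, hi20` located at `pc`, followed by `jalr lo12(t0)`."""
    t0 = (pc + (hi20 << 12)) & MASK64   # auipc: pc + (imm << 12); imm is positive here
    return (t0 + lo12) & MASK64 & ~1    # jalr: add the offset, then clear bit 0

def is_aligned(addr: int, size: int) -> bool:
    """True if `addr` is a multiple of `size` bytes."""
    return addr % size == 0

# Values copied from the hs_err log above.
AUIPC_PC = 0x0000003FA9732E1E   # auipc t0,0x11e3b
JALR_PC = 0x0000003FA9732E22    # jalr -2024(t0)
CRASH_PC = 0x0000003FBB56D640   # pc register at the time of the crash

target = auipc_jalr_target(AUIPC_PC, 0x11E3B, -2024)
print(hex(target))                                        # 0x3fbb56d636
print(is_aligned(AUIPC_PC, 2), is_aligned(AUIPC_PC, 4))   # True False
print(is_aligned(JALR_PC, 2), is_aligned(JALR_PC, 4))     # True False
print(CRASH_PC - target)                                  # 10
```

The computed destination matches the address the disassembler printed for the `jalr`, and the reported pc lies 10 bytes past it. Both instructions are 4 bytes wide but start on addresses that are only 2-byte aligned, so a 4-byte write that patches either of them is an unaligned access.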


      [1] https://mail.openjdk.org/pipermail/riscv-port-dev/2022-September/000618.html
      [2] https://github.com/riscv-collab/riscv-openjdk/issues/23

            xlinzheng Xiaolin Zheng