JDK-8299162

Refactor shared trampoline emission logic

    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • 21
    • 21
    • hotspot
    • None
    • 21
    • Resolved In Build: b09
    • CPU: aarch64, riscv
    • OS: linux

      After the quick fix [JDK-8297763](https://bugs.openjdk.org/browse/JDK-8297763), the shared trampoline logic has become somewhat verbose. If we switch to batch emission of trampoline stubs, pre-calculating their total size and pre-allocating space for them, we can remove the CodeBuffer expansion check that currently runs for each stub and clean up the surrounding code.

      ```
      [Stub Code]
      ...
      <shared trampoline stub1, (A):>
        __ align() // emit nothing or a 4-byte padding
                    <-- (B) multiple relocations at the pc: __ relocate(<the pc here>, trampoline_stub_Relocation::spec())
        __ ldr()
        __ br()
        __ emit_int64()
      <shared trampoline stub2, (C):>
        __ align() // emit nothing or a 4-byte padding
                    <-- multiple relocations at the pc: __ relocate(<the pc here>, trampoline_stub_Relocation::spec())
        __ ldr()
        __ br()
        __ emit_int64()
      <shared trampoline stub3:>
        __ align() // emit nothing or a 4-byte padding
                    <-- multiple relocations at the pc: __ relocate(<the pc here>, trampoline_stub_Relocation::spec())
        __ ldr()
        __ br()
        __ emit_int64()
      ```

      Here, `pc_at_(C) - pc_at_(B)` is the fixed length `NativeCallTrampolineStub::instruction_size`, but `pc_at_(B) - pc_at_(A)` may be either 0 or 4, so it is not a fixed-length value.
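
      The arithmetic can be checked with a small standalone model. This is only a sketch, not HotSpot code: `kStubSize`, `kAlignment`, `StubLayout`, `align_up` and `layout_stub` are hypothetical names, the 16-byte size is simply the sum of the `ldr`, `br` and `emit_int64` in the diagram above (assuming 4-byte instructions), and the 8-byte alignment is inferred from the "nothing or a 4-byte padding" comment.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Illustrative constants (assumptions, not read from HotSpot sources).
      constexpr std::uintptr_t kStubSize  = 16;  // stands in for NativeCallTrampolineStub::instruction_size
      constexpr std::uintptr_t kAlignment = 8;   // assumed stub alignment

      struct StubLayout {
        std::uintptr_t a;  // pc before align()
        std::uintptr_t b;  // pc after align(), where the relocations are bound
        std::uintptr_t c;  // pc after the stub body
      };

      // Round pc up to the next multiple of alignment (a power of two).
      constexpr std::uintptr_t align_up(std::uintptr_t pc, std::uintptr_t alignment) {
        return (pc + alignment - 1) & ~(alignment - 1);
      }

      // Compute (A), (B) and (C) for a stub that starts at pc a.
      constexpr StubLayout layout_stub(std::uintptr_t a) {
        return StubLayout{a, align_up(a, kAlignment), align_up(a, kAlignment) + kStubSize};
      }

      // (B) - (A) depends on where (A) happens to fall: 0 or 4 bytes.
      static_assert(layout_stub(0x1000).b - layout_stub(0x1000).a == 0, "aligned start needs no padding");
      static_assert(layout_stub(0x1004).b - layout_stub(0x1004).a == 4, "misaligned start needs 4 bytes");
      // (C) - (B) is the same in both cases.
      static_assert(layout_stub(0x1000).c - layout_stub(0x1000).b == kStubSize, "fixed stub size");
      static_assert(layout_stub(0x1004).c - layout_stub(0x1004).b == kStubSize, "fixed stub size");
      ```

      Because only the second distance is constant, (B) can be recovered from (C) but not from (A) without redoing the alignment.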

      Originally, the `emit` lambda inside `emit_shared_trampolines()` emitted each shared trampoline as follows:
      ```
      We are at (A) ->
      do an align() ->
      We are at (B) ->
      emit lots of relocations bound to this shared trampoline at (B) ->
      do an emit_trampoline_stub() ->
      We are at (C)
      ```

      After this patch:
      ```
      We are at (A) ->
      do an emit_trampoline_stub(), which contains an align() already ->
      We are at (C) directly ->
      calculate the (B) address backwards from (C), since `pc_at_(C) - pc_at_(B)` is a fixed-length value ->
      emit lots of relocations bound to this shared trampoline at (B)
      ```
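
      The two orderings can be compared in a toy model. This is a sketch under stated assumptions rather than the actual patch: `ToyAssembler`, `emit_original` and `emit_refactored` are hypothetical, the toy tracks only the pc and the addresses passed to `relocate`, and the 16-byte stub size and 8-byte alignment are illustrative values.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <vector>

      // Illustrative constants (assumptions, not read from HotSpot sources).
      constexpr std::uintptr_t kStubSize  = 16;
      constexpr std::uintptr_t kAlignment = 8;

      // Toy stand-in for the assembler: tracks the pc and relocation addresses only.
      struct ToyAssembler {
        std::uintptr_t pc;
        std::vector<std::uintptr_t> reloc_pcs;

        void align() { pc = (pc + kAlignment - 1) & ~(kAlignment - 1); }
        void relocate(std::uintptr_t at) { reloc_pcs.push_back(at); }
        // Like the real stub emitter described above, this aligns first.
        void emit_trampoline_stub() { align(); pc += kStubSize; }
      };

      // Original order: align, bind relocations at (B), then emit the stub.
      std::vector<std::uintptr_t> emit_original(std::uintptr_t start, int stubs, int relocs_per_stub) {
        ToyAssembler masm{start, {}};
        for (int s = 0; s < stubs; ++s) {
          masm.align();                                  // (A) -> (B)
          for (int r = 0; r < relocs_per_stub; ++r) {
            masm.relocate(masm.pc);                      // bound at (B)
          }
          masm.emit_trampoline_stub();                   // (B) -> (C)
        }
        return masm.reloc_pcs;
      }

      // Refactored order: emit the stub first, then derive (B) from (C).
      std::vector<std::uintptr_t> emit_refactored(std::uintptr_t start, int stubs, int relocs_per_stub) {
        ToyAssembler masm{start, {}};
        for (int s = 0; s < stubs; ++s) {
          masm.emit_trampoline_stub();                   // (A) -> (C)
          const std::uintptr_t b = masm.pc - kStubSize;  // (B) = (C) - fixed stub size
          for (int r = 0; r < relocs_per_stub; ++r) {
            masm.relocate(b);                            // bound at (B)
          }
        }
        return masm.reloc_pcs;
      }
      ```

      In this model both functions return the same list of relocation addresses for any starting pc, which is the sense in which the two orderings are equivalent.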

      The resulting behavior is theoretically the same. This is purely a refactoring that lets us remove some of the checks and makes the code cleaner.
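
      As a rough illustration of why batching removes checks, the sketch below reserves a worst-case size once and then emits every stub without further checks. Everything here is hypothetical: `ToyStubSection`, `ensure_remaining`, `worst_case_total` and `emit_batch` are not HotSpot APIs, the toy check only reports whether space is available instead of expanding anything, and the per-stub bound of 16 bytes plus 4 bytes of padding is an assumed, deliberately conservative figure.

      ```cpp
      #include <cassert>
      #include <cstddef>

      // Illustrative constants (assumptions, not read from HotSpot sources).
      constexpr std::size_t kStubSize   = 16;
      constexpr std::size_t kMaxPadding = 4;
      constexpr std::size_t kAlignment  = 8;

      // Toy stand-in for the stubs section of a code buffer.
      struct ToyStubSection {
        std::size_t capacity;
        std::size_t used;
        int checks;  // how many times remaining space was checked

        // Hypothetical check: reports whether `bytes` more bytes fit.
        bool ensure_remaining(std::size_t bytes) {
          ++checks;
          return capacity - used >= bytes;
        }
      };

      // Conservative upper bound on the space needed by `stubs` trampolines.
      constexpr std::size_t worst_case_total(std::size_t stubs) {
        return stubs * (kStubSize + kMaxPadding);
      }

      // Check once for the whole batch, then emit without per-stub checks.
      bool emit_batch(ToyStubSection& section, std::size_t stubs) {
        if (!section.ensure_remaining(worst_case_total(stubs))) {
          return false;  // nothing has been emitted yet
        }
        for (std::size_t i = 0; i < stubs; ++i) {
          const std::size_t aligned = (section.used + kAlignment - 1) & ~(kAlignment - 1);
          section.used = aligned + kStubSize;
        }
        return true;
      }
      ```

      The design point is that a failed reservation is detected before any stub is written, so the emission loop itself needs no failure path.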

      Tested AArch64 hotspot tier1~4 with a fastdebug build twice; tested RISC-V hotspot tier1~4 with a fastdebug build on hardware once.

            xlinzheng Xiaolin Zheng