JDK-8299821

RISC-V: Optimize zero_blocks and zero_words stubs


    • Type: Enhancement
    • Resolution: Duplicate
    • Priority: P4
    • None
    • 21
    • hotspot
    • None
    • riscv
    • linux

      Currently zero_blocks (when Zicboz is unavailable, and hardware that implements it is very hard to find)
      generates this code:

        StubRoutines::zero_blocks [0x0000003fb1033f00, 0x0000003fb1033f58] (88 bytes)

                0x0000003fb1033f00: addi t4,t4,-8
         0.54%  0x0000003fb1033f04: bltz t4,Stub::zero_blocks+80 0x0000003fb1033f50
         0.65%  0x0000003fb1033f08: sd zero,0(t3)
         5.88%  0x0000003fb1033f0c: addi t3,t3,8
         0.73%  0x0000003fb1033f10: sd zero,0(t3)
         3.72%  0x0000003fb1033f14: addi t3,t3,8
         1.83%  0x0000003fb1033f18: sd zero,0(t3)
         4.47%  0x0000003fb1033f1c: addi t3,t3,8
         1.31%  0x0000003fb1033f20: sd zero,0(t3)
        24.18%  0x0000003fb1033f24: addi t3,t3,8
         1.43%  0x0000003fb1033f28: sd zero,0(t3)
        16.56%  0x0000003fb1033f2c: addi t3,t3,8
         1.97%  0x0000003fb1033f30: sd zero,0(t3)
         3.44%  0x0000003fb1033f34: addi t3,t3,8
         1.46%  0x0000003fb1033f38: sd zero,0(t3)
         3.33%  0x0000003fb1033f3c: addi t3,t3,8
         1.68%  0x0000003fb1033f40: sd zero,0(t3)
         2.46%  0x0000003fb1033f44: addi t3,t3,8
         2.00%  0x0000003fb1033f48: addi t4,t4,-8
         0.85%  0x0000003fb1033f4c: bgez t4,Stub::zero_blocks+8 0x0000003fb1033f08
         0.04%  0x0000003fb1033f50: addi t4,t4,8
         0.55%  0x0000003fb1033f54: ret

        It can be optimized to produce the following code, which shrinks the stub from 88 to 60 bytes and removes the dependency of each store on the preceding addi to t3:

        StubRoutines::zero_blocks [0x0000003fa8e88b00, 0x0000003fa8e88b3c] (60 bytes)

                0x0000003fa8e88b00: addi t4,t4,-8
         0.50%  0x0000003fa8e88b04: bltz t4,Stub::zero_blocks+52 0x0000003fa8e88b34
         0.82%  0x0000003fa8e88b08: sd zero,0(t3)
         5.50%  0x0000003fa8e88b0c: sd zero,8(t3)
         9.43%  0x0000003fa8e88b10: sd zero,16(t3)
         4.96%  0x0000003fa8e88b14: sd zero,24(t3)
        21.03%  0x0000003fa8e88b18: sd zero,32(t3)
        20.00%  0x0000003fa8e88b1c: sd zero,40(t3)
         6.57%  0x0000003fa8e88b20: sd zero,48(t3)
         3.91%  0x0000003fa8e88b24: sd zero,56(t3)
         4.37%  0x0000003fa8e88b28: addi t3,t3,64
         0.32%  0x0000003fa8e88b2c: addi t4,t4,-8
         0.88%  0x0000003fa8e88b30: bgez t4,Stub::zero_blocks+8 0x0000003fa8e88b08
         0.03%  0x0000003fa8e88b34: addi t4,t4,8
         0.50%  0x0000003fa8e88b38: ret
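
      The shape of the shorter loop can be modeled in plain C++. This is only an illustrative sketch, not HotSpot code: the names zero_blocks_model and ZeroResult are hypothetical, base stands in for t3 and cnt for t4 (a count of 8-byte words). Each store encodes its offset in the sd immediate, so the pointer is bumped once per iteration instead of after every store. That takes the stub from 22 instructions to 15, matching the 88 and 60 byte sizes in the listings above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the optimized loop; not taken from HotSpot sources.
struct ZeroResult {
    std::uint64_t* base;  // pointer advanced past the zeroed blocks (t3 on exit)
    std::int64_t   cnt;   // words left for the caller to zero (t4 on exit)
};

inline ZeroResult zero_blocks_model(std::uint64_t* base, std::int64_t cnt) {
    assert(base != nullptr);
    constexpr int kUnroll = 8;       // eight sd instructions per iteration
    cnt -= kUnroll;                  // addi t4,t4,-8
    while (cnt >= 0) {               // bltz on entry, bgez at the loop end
        for (int i = 0; i < kUnroll; ++i) {
            base[i] = 0;             // sd zero,(8*i)(t3): offset lives in the immediate
        }
        base += kUnroll;             // one addi t3,t3,64 for the whole block
        cnt -= kUnroll;              // addi t4,t4,-8
    }
    cnt += kUnroll;                  // addi t4,t4,8: restore the remainder
    return {base, cnt};
}
```

      In this model the caller is expected to zero the remaining cnt words (fewer than 8) itself, which mirrors how the stub leaves the remainder in t4.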


      store_words can also be improved in a similar way, from:

         0.07%  0x0000003fa940a8dc: andi t0,t4,4
         1.24%  0x0000003fa940a8e0: beqz t0,0x0000003fa940a904
                0x0000003fa940a8e4: sd zero,0(t3)
                0x0000003fa940a8e8: addi t3,t3,8
                0x0000003fa940a8ec: sd zero,0(t3)
                0x0000003fa940a8f0: addi t3,t3,8
                0x0000003fa940a8f4: sd zero,0(t3)
                0x0000003fa940a8f8: addi t3,t3,8
                0x0000003fa940a8fc: sd zero,0(t3)
                0x0000003fa940a900: addi t3,t3,8
         0.58%  0x0000003fa940a904: andi t0,t4,2
                0x0000003fa940a908: beqz t0,0x0000003fa940a91c
                0x0000003fa940a90c: sd zero,0(t3)
                0x0000003fa940a910: addi t3,t3,8
                0x0000003fa940a914: sd zero,0(t3)
                0x0000003fa940a918: addi t3,t3,8
                0x0000003fa940a91c: andi t0,t4,1
         0.17%  0x0000003fa940a920: beqz t0,0x0000003fa940a928
                0x0000003fa940a924: sd zero,0(t3)


      to:

         0.27%  0x0000003fd9407acc: andi t0,t4,4
         1.06%  0x0000003fd9407ad0: beqz t0,0x0000003fd9407ae8
                0x0000003fd9407ad4: sd zero,0(t3)
                0x0000003fd9407ad8: sd zero,8(t3)
                0x0000003fd9407adc: sd zero,16(t3)
                0x0000003fd9407ae0: sd zero,24(t3)
                0x0000003fd9407ae4: addi t3,t3,32
         0.53%  0x0000003fd9407ae8: andi t0,t4,2
                0x0000003fd9407aec: beqz t0,0x0000003fd9407afc
                0x0000003fd9407af0: sd zero,0(t3)
                0x0000003fd9407af4: sd zero,8(t3)
                0x0000003fd9407af8: addi t3,t3,16
                0x0000003fd9407afc: andi t0,t4,1
         0.15%  0x0000003fd9407b00: beqz t0,0x0000003fd9407b08
                0x0000003fd9407b04: sd zero,0(t3)
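
      The tail sequence above tests bits 4, 2 and 1 of the word count and zeroes that many words for each set bit. A minimal C++ model of that logic, assuming base stands in for t3 and cnt for t4 (zero_tail_model is a hypothetical name, not a HotSpot function), could look like this:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the optimized tail; not taken from HotSpot sources.
// Zeroes (cnt & 7) words starting at base and returns the pointer as the
// assembly leaves t3: bumped after the 4-word and 2-word groups only.
inline std::uint64_t* zero_tail_model(std::uint64_t* base, std::uint64_t cnt) {
    assert(base != nullptr);
    if (cnt & 4) {                   // andi t0,t4,4 ; beqz t0,...
        base[0] = 0;                 // sd zero,0(t3)
        base[1] = 0;                 // sd zero,8(t3)
        base[2] = 0;                 // sd zero,16(t3)
        base[3] = 0;                 // sd zero,24(t3)
        base += 4;                   // addi t3,t3,32
    }
    if (cnt & 2) {                   // andi t0,t4,2 ; beqz t0,...
        base[0] = 0;                 // sd zero,0(t3)
        base[1] = 0;                 // sd zero,8(t3)
        base += 2;                   // addi t3,t3,16
    }
    if (cnt & 1) {                   // andi t0,t4,1 ; beqz t0,...
        base[0] = 0;                 // sd zero,0(t3), no pointer bump afterwards
    }
    return base;
}
```

      With cnt = 7 all three groups run and seven words are zeroed. Compared with the original sequence, only the addressing changes: the per-store addi is folded into the sd offset.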

            vkempik Vladimir Kempik