Shenandoah: Switch nmethod entry barriers to conc_instruction_and_data_patch

    • gc
    • b15

        Looking at Renaissance benchmarks, I notice that some benchmarks, like scala-doku, are significantly slower with Shenandoah than with other collectors:

        $ build/linux-aarch64-server-release/images/jdk/bin/java -jar ~/renaissance-jmh-0.16.0.jar ScalaDoku -wi 3 -i 3 -f 1 --jvmArgs "-Xmx8g -Xms8g -XX:+AlwaysPreTouch -XX:+UseParallelGC"

        Benchmark         Mode  Cnt     Score     Error  Units
        JmhScalaDoku.run    ss    3  2160.655 ± 364.043  ms/op

        $ build/linux-aarch64-server-release/images/jdk/bin/java -jar ~/renaissance-jmh-0.16.0.jar ScalaDoku -wi 3 -i 3 -f 1 --jvmArgs "-Xmx8g -Xms8g -XX:+AlwaysPreTouch -XX:+UseShenandoahGC"

        Benchmark         Mode  Cnt     Score     Error  Units
        JmhScalaDoku.run    ss    3  3843.770 ± 740.348  ms/op


        perfasm shows that the hotspot is at the "dmb ishld" in the nmethod entry barrier.

        ....[Hottest Region 2]..............................................................................
        c2, scala.collection.immutable.SetIterator::next, version 1, compile id 893

                      0x0000ffffa8599008: nop
                    [Entry Point]
                      # {method} {0x0000fffe789eb008} 'next' '()Ljava/lang/Object;' in 'scala/collection/immutable/SetIterator'
                      # [sp+0x30] (sp of caller)
                      0x0000ffffa859900c: ldr w8, [x1, #8]
                      0x0000ffffa8599010: ldr w10, [x9, #8]
                      0x0000ffffa8599014: cmp w8, w10
                  ╭ 0x0000ffffa8599018: b.eq 0x0000ffffa8599020 // b.none
                  │ 0x0000ffffa859901c: b 0x0000ffffa848ec60 ; {runtime_call Shared Runtime ic_miss_blob}
                  │ [Verified Entry Point]
           0.19% ↘ 0x0000ffffa8599020: sub x9, sp, #0x14, lsl #12
           0.09% 0x0000ffffa8599024: str xzr, [x9]
           0.09% 0x0000ffffa8599028: sub sp, sp, #0x30
           0.09% 0x0000ffffa859902c: stp x29, x30, [sp, #32]
           0.09% 0x0000ffffa8599030: ldr w8, 0x0000ffffa8599174
           0.07% 0x0000ffffa8599034: dmb ishld
           7.22% 0x0000ffffa8599038: ldr w9, [x28, #32]
           0.08% 0x0000ffffa859903c: cmp w8, w9
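The verified-entry sequence in the listing above can be read roughly as the C++ model below. This is only an illustrative sketch, not HotSpot code: `NMethodModel`, `ThreadModel` and `needs_slow_path` are hypothetical names, and the acquire fence is a stand-in for the `dmb ishld` load barrier. It assumes that the `cmp` is followed by a branch to a slow path, which the listing does not show.

```cpp
// Simplified model of the entry-barrier fast path seen in the listing.
// NOT HotSpot code: all names here are hypothetical, for illustration only.
#include <atomic>
#include <cstdint>

struct NMethodModel {
  // Per-nmethod guard word; models "ldr w8, <pc-relative address>".
  std::atomic<uint32_t> guard;
};

struct ThreadModel {
  // Thread-local comparison value; models "ldr w9, [x28, #32]".
  uint32_t disarmed_value;
};

// Returns true when the guard differs from the thread-local value, i.e.
// when the caller would (by assumption) have to leave the fast path.
inline bool needs_slow_path(const NMethodModel& nm, const ThreadModel& t) {
  uint32_t guard = nm.guard.load(std::memory_order_relaxed);
  // Stand-in for "dmb ishld": later loads may not be reordered before the
  // guard load. In this model the fence is paid on every method entry.
  std::atomic_thread_fence(std::memory_order_acquire);
  return guard != t.disarmed_value;
}
```

In this model, a caller would store a value into `nm.guard`, build a `ThreadModel` with the same or a different `disarmed_value`, and branch on the result of `needs_slow_path(nm, t)`. The point of the sketch is that the fence sits on the unconditional path, so every non-inlined call pays for it.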


        It makes sense that this affects benchmarks that are not as deeply inlined, because every non-inlined call runs the entry barrier. Erik did JDK-8290700, which ported a new way to sync up nmethod barriers, conc_instruction_and_data_patch, from the Generational ZGC repository into mainline. Generational ZGC has been using it since JDK 21. Switching Shenandoah to it like so:

        diff --git a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.hpp b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.hpp
        index a12d4e2beec..c89847b9d52 100644
        --- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.hpp
        +++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.hpp
        @@ -67,7 +67,7 @@ class ShenandoahBarrierSetAssembler: public BarrierSetAssembler {
                                                 Register scratch, RegSet saved_regs);
         
         public:
        - virtual NMethodPatchingType nmethod_patching_type() { return NMethodPatchingType::conc_data_patch; }
        + virtual NMethodPatchingType nmethod_patching_type() { return NMethodPatchingType::conc_instruction_and_data_patch; }
         
         #ifdef COMPILER1
           void gen_pre_barrier_stub(LIR_Assembler* ce, ShenandoahPreBarrierStub* stub);

        ...makes Shenandoah perform significantly better on this example workload:

        Benchmark         Mode  Cnt     Score     Error  Units
        JmhScalaDoku.run    ss    3  2616.273 ±  51.920  ms/op
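To put the three scores side by side, the relative slowdowns can be computed as in the small sketch below. The constants are copied from the JMH output in this report; the names (`kParallel`, `kShenandoahBefore`, `kShenandoahAfter`, `ratio`, `print_ratios`) are made up for illustration. It compares mean scores only, so with three iterations per run and the reported error intervals the ratios should be read as rough indications.

```cpp
#include <cstdio>

// Mean scores copied from the JMH runs in this report (ms/op, lower is better).
constexpr double kParallel         = 2160.655;  // -XX:+UseParallelGC
constexpr double kShenandoahBefore = 3843.770;  // Shenandoah, conc_data_patch
constexpr double kShenandoahAfter  = 2616.273;  // Shenandoah, with the diff applied

// Ratio of a score to a baseline; values above 1.0 mean slower than the baseline.
constexpr double ratio(double score, double baseline) {
  return score / baseline;
}

// Prints the three ratios of interest with two decimal places.
void print_ratios() {
  std::printf("Shenandoah before vs Parallel: %.2fx\n",
              ratio(kShenandoahBefore, kParallel));
  std::printf("Shenandoah after  vs Parallel: %.2fx\n",
              ratio(kShenandoahAfter, kParallel));
  std::printf("Shenandoah before vs after:    %.2fx\n",
              ratio(kShenandoahBefore, kShenandoahAfter));
}
```

Calling `print_ratios()` from a `main` function prints the ratios. On the means alone, the change closes most of the gap to Parallel on this workload, but not all of it.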

        We need to see what else should be done to support conc_instruction_and_data_patch in Shenandoah barriers.

              Assignee: Cesar Soares
              Reporter: Aleksey Shipilev