JDK-8325587

Shenandoah: ShenandoahLock should allow blocking in VM

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: P3
    • Fix Version: 23
    • Affects Versions: 17, 21, 22, 23
    • Component: hotspot
    • Subcomponent: gc
    • Resolved In Build: b11

      Description

        On a sample run (log attached), HandshakeForDeflation took more than 193 seconds to complete. Similarly long handshake times show up in roughly one of every four runs of this workload. The host has 16 vCPUs.
        ```
        [921.732s][info ][handshake ] Handshake "HandshakeForDeflation", Targeted threads: 2017, Executed by requesting thread: 16, Total completion time: 193294197991 ns
        ```
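To locate such outliers in a long log, the handshake summary lines can be scanned mechanically. The following is a minimal sketch, assuming the log line layout shown in the excerpt above; the helper name `slow_handshakes` and the 1-second default threshold are illustrative choices, not part of the JDK.

```python
import re

# Matches the handshake summary line layout shown in the log excerpt above.
HANDSHAKE_RE = re.compile(
    r'\[(?P<uptime>[\d.]+)s\].*?Handshake "(?P<name>[^"]+)", '
    r'Targeted threads: (?P<targeted>\d+), '
    r'Executed by requesting thread: (?P<by_requester>\d+), '
    r'Total completion time: (?P<ns>\d+) ns'
)

def slow_handshakes(lines, threshold_s=1.0):
    """Yield (uptime_s, name, targeted, duration_s) for handshakes slower than threshold_s."""
    for line in lines:
        m = HANDSHAKE_RE.search(line)
        if m is None:
            continue  # not a handshake summary line
        duration_s = int(m.group("ns")) / 1e9  # log reports nanoseconds
        if duration_s > threshold_s:
            yield (float(m.group("uptime")), m.group("name"),
                   int(m.group("targeted")), duration_s)

# The summary line from the report, used here as sample input.
sample = ('[921.732s][info ][handshake ] Handshake "HandshakeForDeflation", '
          'Targeted threads: 2017, Executed by requesting thread: 16, '
          'Total completion time: 193294197991 ns')

for uptime, name, targeted, duration in slow_handshakes([sample]):
    print(f"{uptime:.3f}s {name}: {targeted} threads, {duration:.2f} s")
```

In practice the `lines` argument would be an open log file rather than a one-element list.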

        The reproducer is a specific configuration of the public Extremem GC benchmark, which can be downloaded from https://github.com/corretto/heapothesys
        ```
        echo Run TradiShen tip with memory size 26g with 4s customer period
        >&2 echo Run TradiShen tip with memory size 26g with 4s customer period
        ~/github/jdk.2-1-2024/build/linux-x86_64-server-release/jdk/bin/java \
          -XX:+UnlockExperimentalVMOptions \
          -XX:+UseTransparentHugePages \
          -XX:-ShenandoahPacing \
          -XX:+AlwaysPreTouch -XX:+DisableExplicitGC -Xms26g -Xmx26g \
          -XX:+UseShenandoahGC \
          -Xlog:"gc*=info,ergo" \
          -Xlog:vmthread=trace -Xlog:handshake*=debug \
          -Xlog:safepoint=trace -Xlog:safepoint=debug -Xlog:safepoint=info \
          -XX:+UnlockDiagnosticVMOptions \
          -jar ~/github/heapothesys/Extremem/target/extremem-1.0-SNAPSHOT.jar \
          -dInitializationDelay=45s -dDictionarySize=16000000 -dNumCustomers=28000000 \
          -dNumProducts=64000 -dCustomerThreads=2000 -dCustomerPeriod=4s -dCustomerThinkTime=1s \
          -dKeywordSearchCount=4 -dServerThreads=5 -dServerPeriod=5s -dProductNameLength=10 \
          -dBrowsingHistoryQueueCount=5 \
          -dSalesTransactionQueueCount=5 \
          -dProductDescriptionLength=64 -dProductReplacementPeriod=25s -dProductReplacementCount=5 \
          -dCustomerReplacementPeriod=30s -dCustomerReplacementCount=1000 -dBrowsingExpiration=1m \
          -dPhasedUpdates=true \
          -dPhasedUpdateInterval=60s \
          -dSimulationDuration=20m -dResponseTimeMeasurements=100000
        ```
        An annotated log of an instrumented execution shows that the outer loop of VM_HandshakeAllThreads::doit() executes 186 times. The last iteration finally succeeds in performing:
        ```
        [921.726s][debug][handshake,task ] Operation: HandshakeForDeflation for thread 0x00007f24bc104630, is_vm_thread: true, completed in 44 ns
        ```
        on a thread that had been identified as _not_safe on the previous iteration.
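To check which target thread completed last in a run like this, the per-operation debug lines can be collected and compared by log uptime. This is a rough sketch, assuming the `handshake,task` line layout shown in the excerpt above; `last_completed` is a hypothetical helper name, not a JDK tool.

```python
import re

# Matches the per-operation debug line layout shown in the excerpt above.
OPERATION_RE = re.compile(
    r'\[(?P<uptime>[\d.]+)s\].*?Operation: (?P<op>\S+) '
    r'for thread (?P<thread>0x[0-9a-fA-F]+), '
    r'is_vm_thread: (?P<vm>true|false), completed in (?P<ns>\d+) ns'
)

def last_completed(lines, op_name):
    """Return (uptime_s, thread, by_vm_thread, op_ns) for the latest matching line, or None."""
    latest = None
    for line in lines:
        m = OPERATION_RE.search(line)
        if m is None or m.group("op") != op_name:
            continue
        record = (float(m.group("uptime")), m.group("thread"),
                  m.group("vm") == "true", int(m.group("ns")))
        # Keep the record with the greatest uptime; ties go to the later line.
        if latest is None or record[0] >= latest[0]:
            latest = record
    return latest

# The debug line from the report, used here as sample input.
sample_op = ('[921.726s][debug][handshake,task ] Operation: HandshakeForDeflation '
             'for thread 0x00007f24bc104630, is_vm_thread: true, completed in 44 ns')

print(last_completed([sample_op], "HandshakeForDeflation"))
```

For the line quoted above, the operation itself took 44 ns, which suggests that nearly all of the 193 seconds was spent waiting for the thread to become safe rather than executing the operation.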

        I am not yet confident enough to conclude whether this is a problem with the implementation of doit() or the fault of the HandshakeForDeflation implementation, which would have to be self-identifying as _not_safe for too long.

              People

                Assignee: Aleksey Shipilev (shade)
                Reporter: Kelvin Nilsen (kdnilsen)