Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8350313

Include timings for leaving safepoint in safepoint logging

XMLWordPrintable

    • b12

      We have a production use case where leaving the safepoint introduced a significant latency, mostly due to VMThread getting de-scheduled. There are mitigations for this, see JDK-8350324. This issue is about capturing such an event in safepoint logging more comprehensively.

      In those cases we would see something like:

      [3.664s][info][gc ] GC(7) Pause Young (Allocation Failure) 277M->3M(989M) 5.145ms
      [3.694s][info][safepoint] Safepoint "SerialCollectForAllocation", Time since last: 374870126 ns, Reaching safepoint: 271019 ns, At safepoint: 35859436 ns, Total: 36130455 ns

      ...which is extremely confusing: the GC itself took ~5ms, yet "At safepoint" mentions 35ms! All that time is actually spent disarming the WaitBarrier, and eating the de/re-scheduling for VMThread at the same time.

      I think a "Leaving safepoint" counter would be good to have. It comes with the symmetry advantage with "Reaching safepoint" counter. "Leaving safepoint" measures the time spent in safepoint machinery from the "finishing side". And, it more clearly captures the transitional state where some threads might be still at safepoint, and some have already unparked, like "Reaching safepoint".

      So the hypothetical log for the excerpt above would look something like:

      [3.664s][info][gc ] GC(7) Pause Young (Allocation Failure) 277M->3M(989M) 5.145ms
      [3.694s][info][safepoint] Safepoint "SerialCollectForAllocation", Time since last: 374870126 ns, Reaching safepoint: 271019 ns, At safepoint: 5859436 ns, Leaving safepoint: 30000000 ns, Total: 36130455 ns

            xpeng Xiaolong Peng
            shade Aleksey Shipilev
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: