Loading...

Type: Bug
Resolution: Fixed
Priority: P2
Fix Version/s: 12
Affects Version/s: 8u45, 9, 10, 11
Component/s: hotspot
Labels:

Subcomponent:
compiler
Introduced In Version:

8
Resolved In Build:
b13
CPU:

generic
OS:

generic

Issue	Fix Version	Assignee	Priority	Status	Resolution	Resolved In Build
JDK-8243397	11.0.8-oracle	Roman Kennke	P2	Closed	Fixed	b04
JDK-8304788	8-pool	Daniel Skantz	P2	Closed	Won't Fix

One of our services that runs with a few thousand threads recently noticed an increase in safepoint stop times, but not gc times, upon transitioning to JDK 8 from JDK 7.

Further investigation revealed that most of the delta was related to the so-called pre-gc/vm-op "cleanup" phase when various book-keeping activities are performed, and more specifically in the portion that walks java thread stacks single-threaded (!) and updates the hotness counters for the active nmethods. This code appears to
be new to JDK 8 (in jdk 7 one would walk the stacks only during code cache sweeps).

This code should be fixed by either:
(a) doing these updates by walking thread stacks in multiple worker threads in parallel, or best of all:
(b) doing these updates when we walk the thread stacks during GC, and skipping this phase entirely
      for non-GC safepoints (with attendant loss in frequency of this update in low GC frequency
      scenarios).

I also wonder how code cache sweep/eviction quality might be compromised if one were to do these hotness updates at a much lower frequency, as a temporary workaround to the performance problem.

This issue was discussed at some length on the hotspot-compiler-dev@openjdk.java.net mailing list in the thread starting at:

       http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-July/018618.html

where further detail can be obtained.

The same issue exists in JDK 9, but not in JDK 7.

More specifically, as stated in one of the email messages in that thread:

''
I noticed the increase even with Initial and Reserved set to the default of 240 MB, but actual usage much lower (less than a quarter).

Look at this code path. Note that this is invoked at every safepoint (although it says "periodically" in the comment). In the mark_active_nmethods() method, there's a thread iteration in both branches of the if.
....
PS: for comparison, here's data with +TraceSafepointCleanup from JDK 7 (first, where this isn't done)
vs JDK 8 (where this is done) with a program that has a few thousands of threads:

JDK 7:
...
2827.308: [sweeping nmethods, 0.0000020 secs]
2828.679: [sweeping nmethods, 0.0000030 secs]
2829.984: [sweeping nmethods, 0.0000030 secs]
2830.956: [sweeping nmethods, 0.0000030 secs]
...

JDK 8:
...
7368.634: [mark nmethods, 0.0177030 secs]
7369.587: [mark nmethods, 0.0178305 secs]
7370.479: [mark nmethods, 0.0180260 secs]
7371.503: [mark nmethods, 0.0186494 secs]
...
Here's a snapshot of the code cache related counters:

sun.ci.codeCacheCapacity=251658240
sun.ci.codeCacheMaxCapacity=251658240
sun.ci.codeCacheMethodsReclaimedNum=3450
sun.ci.codeCacheSweepsTotalNum=58
sun.ci.codeCacheSweepsTotalTimeMillis=1111
sun.ci.codeCacheUsed=35888704

Notice that the code cache usage is less that 35 MB, for the 240 MB capacity, yet it seems we have had 58 sweeps already, and safepoint cleanup says:

[mark nmethods, 0.0165062 secs]

Even if the two closures do little or no work, the single-threaded walk over deep stacks of a thousand threads will cost time for applications with many threads, and this is now done at each safepoint irrespective of the sweeper activity as far as I can tell. It seems as if this work should be somehow rolled up (via a suitable injection) into GC's thread walks that are done in parallel, rather than doing this in a pre-GC phase (unless I am mssing some reason that the sequencing is necessary, which it doesn't seem to be here).
''

// Scans the stacks of all Java threads and marks activations of not-entrant methods.
// No need to synchronize access, since 'mark_active_nmethods' is always executed at a
// safepoint.
void NMethodSweeper::mark_active_nmethods() {
  assert(SafepointSynchronize::is_at_safepoint(), "must be executed at a safepoint");
  // If we do not want to reclaim not-entrant or zombie methods there is no need
  // to scan stacks
  if (!MethodFlushing) {
    return;
  }

  // Increase time so that we can estimate when to invoke the sweeper again.
  _time_counter++;

  // Check for restart
  assert(CodeCache::find_blob_unsafe(_current) == _current, "Sweeper nmethod cached state invalid");
  if (!sweep_in_progress()) {
    _seen = 0;
    _sweep_fractions_left = NmethodSweepFraction;
    _current = CodeCache::first_nmethod();
    _traversals += 1;
    _total_time_this_sweep = Tickspan();

    if (PrintMethodFlushing) {
      tty->print_cr("### Sweep: stack traversal %d", _traversals);
    }
    Threads::nmethods_do(&mark_activation_closure);

  } else {
    // Only set hotness counter
    Threads::nmethods_do(&set_hotness_closure);
  }

  OrderAccess::storestore();
}

backported by

JDK-8243397 Increased stop time in cleanup phase because of single-threaded walk of thread stacks in NMethodSweeper::mark_active_nmethods()

Closed

JDK-8304788 Increased stop time in cleanup phase because of single-threaded walk of thread stacks in NMethodSweeper::mark_active_nmethods()

Closed

is blocked by

JDK-8184751 Provide thread pool for parallel safepoint cleanup

Closed

relates to

JDK-8211129 compiler/whitebox/ForceNMethodSweepTest.java fails after JDK-8132849

Resolved

JDK-8211392 compiler/profiling/spectrapredefineclass_classloaders/Launcher.java times out in JDK12 CI

Resolved

JDK-8215555 TieredCompilation C2 threads can excessively block handshakes

Resolved

JDK-8211141 jdk/jfr/event/compiler/TestCodeSweeperStats.java fails with Error: Value not atLeast1, field='sweepCount'

Closed

JDK-8180932 Parallelize safepoint cleanup

Resolved

JDK-8184751 Provide thread pool for parallel safepoint cleanup

Closed

(4 relates to)

Details

Backports

Description

Attachments

Issue Links

Activity

People

Dates