JDK-8294215

CallSiteSetTargetSelf microbenchmarks regression due to changes in nmethod unloading behavior


      Scores on the CallSiteSetTargetSelf.test* microbenchmarks have regressed compared to JDK 8, and they get progressively worse the longer the benchmark runs:

      # Benchmark: org.openjdk.bench.java.lang.invoke.CallSiteSetTargetSelf.testMutable

      11.0.16:

      # Run progress: 0,00% complete, ETA 00:16:40
      # Fork: 1 of 5
      # Warmup Iteration 1: 1943,091 ns/op
      # Warmup Iteration 2: 4190,466 ns/op
      # Warmup Iteration 3: 5090,832 ns/op
      # Warmup Iteration 4: 6211,476 ns/op
      # Warmup Iteration 5: 6803,869 ns/op
      Iteration 1: 7338,019 ns/op
      Iteration 2: 8029,986 ns/op
      Iteration 3: 8704,329 ns/op
      Iteration 4: 9223,633 ns/op
      Iteration 5: 10498,911 ns/op

      8.0.345:

      # Run progress: 0,00% complete, ETA 00:16:40
      # Fork: 1 of 5
      # Warmup Iteration 1: 326,132 ns/op
      # Warmup Iteration 2: 317,761 ns/op
      # Warmup Iteration 3: 315,248 ns/op
      # Warmup Iteration 4: 319,478 ns/op
      # Warmup Iteration 5: 317,688 ns/op
      Iteration 1: 324,402 ns/op
      Iteration 2: 314,060 ns/op
      Iteration 3: 327,899 ns/op
      Iteration 4: 324,006 ns/op
      Iteration 5: 319,762 ns/op

      [~vlivanov] did some analysis and showed that this is due to a growing number of dependent nmethods that have to be checked every time the CallSite target changes. JDK 8 unloads nmethods more aggressively, and similar behavior can be provoked on 8 by disabling method unloading (-XX:-MethodFlushing):

      # Run progress: 0,00% complete, ETA 00:16:40
      # Fork: 1 of 5
      # Warmup Iteration 1: 2509,419 ns/op
      # Warmup Iteration 2: 3808,516 ns/op
      # Warmup Iteration 3: 4496,784 ns/op
      # Warmup Iteration 4: 5301,225 ns/op
      # Warmup Iteration 5: 5607,454 ns/op

      Conversely, forcing unloads to happen more often by limiting the code cache makes 11+ behave better on the micro:

      11.0.16 with -XX:ReservedCodeCacheSize=3m

      # Run progress: 0,00% complete, ETA 00:16:40
      # Fork: 1 of 5
      # Warmup Iteration 1: 262,584 ns/op
      # Warmup Iteration 2: 262,319 ns/op
      # Warmup Iteration 3: 257,795 ns/op
      # Warmup Iteration 4: 257,002 ns/op
      # Warmup Iteration 5: 258,168 ns/op
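
      To make the suspected mechanism concrete, here is a toy model in plain Java. It is not HotSpot's dependency context; the class and method names (DependencyContextModel, Dependent, register, onTargetChange, purgeDead) are hypothetical. It assumes that each target change walks every recorded dependent, and that stale entries stay in the list until something purges them.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: names and structure are assumptions, not HotSpot code.
final class DependencyContextModel {
    // Hypothetical stand-in for a compiled method that depends on the call site.
    static final class Dependent {
        boolean alive = true;
    }

    private final List<Dependent> dependents = new ArrayList<>();

    // A newly compiled method records itself as a dependent.
    Dependent register() {
        Dependent d = new Dependent();
        dependents.add(d);
        return d;
    }

    // Models a target change: every recorded entry is visited, stale or not.
    // Returns the number of entries visited.
    int onTargetChange() {
        int visited = 0;
        for (Dependent d : dependents) {
            visited++;
        }
        return visited;
    }

    // Models unloading: drops entries whose method is no longer alive.
    void purgeDead() {
        dependents.removeIf(d -> !d.alive);
    }

    // Each round replaces the previous dependent with a new one, then changes
    // the target. Returns the number of entries visited in the final round.
    static int simulate(int rounds, boolean purgeEachRound) {
        DependencyContextModel ctx = new DependencyContextModel();
        Dependent previous = null;
        int visited = 0;
        for (int i = 0; i < rounds; i++) {
            if (previous != null) {
                previous.alive = false; // superseded, but still recorded
            }
            previous = ctx.register();
            if (purgeEachRound) {
                ctx.purgeDead();
            }
            visited = ctx.onTargetChange();
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(simulate(1000, false)); // prints 1000
        System.out.println(simulate(1000, true));  // prints 1
    }
}
```

      Without purging, the per-change work in this model grows linearly with the number of rounds; with purging it stays constant. That matches the shape of the logs above, but it does not establish the cause.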

      It's unclear how much of a problem this is in practice, but applications relying on mutable CallSites are susceptible. This would be an annoying performance issue in production, since application performance might depend heavily on whether nmethod unloading happens in a timely manner. [~vlivanov] has suggested we might be able to clean dependency contexts proactively before the relevant nmethods get unloaded.
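
      For readers unfamiliar with the API pattern involved, the following is a minimal standalone sketch, not the JDK microbenchmark itself. The class name CallSiteSetTargetSketch and the helper answer() are hypothetical. It repeatedly re-sets a MutableCallSite's target and invokes through the site's dynamic invoker. Whether it shows the slowdown depends on the JIT creating dependent nmethods, which this sketch does not guarantee.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Illustrative sketch only; not the org.openjdk.bench benchmark source.
final class CallSiteSetTargetSketch {
    // Hypothetical target method bound to the call site.
    static int answer() {
        return 42;
    }

    static final MutableCallSite SITE;
    static final MethodHandle INVOKER;

    static {
        try {
            MethodHandle target = MethodHandles.lookup().findStatic(
                    CallSiteSetTargetSketch.class, "answer",
                    MethodType.methodType(int.class));
            SITE = new MutableCallSite(target);
            // Invoking through the dynamic invoker always uses the site's current target.
            INVOKER = SITE.dynamicInvoker();
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Invokes the current target through the call site.
    static int callThroughSite() {
        try {
            return (int) INVOKER.invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Re-sets the target to its current value `rounds` times, invoking after
    // each change, and returns the average wall-clock cost per round in ns.
    static double averageSetTargetNanos(int rounds) {
        int sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            SITE.setTarget(SITE.getTarget());
            sink += callThroughSite();
        }
        long elapsed = System.nanoTime() - start;
        if (sink != 42 * rounds) {
            throw new IllegalStateException("unexpected result");
        }
        return (double) elapsed / rounds;
    }

    public static void main(String[] args) {
        // Print several batches so any drift over time becomes visible.
        for (int batch = 1; batch <= 5; batch++) {
            System.out.printf("batch %d: %.1f ns/op%n",
                    batch, averageSetTargetNanos(100_000));
        }
    }
}
```

      Running it with and without -XX:ReservedCodeCacheSize=3m would be one way to compare behavior, though timings from a hand-rolled loop like this are far less reliable than JMH results.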

            dlong Dean Long
            redestad Claes Redestad