JDK-8285447

StackWalker minimal batch size should be optimized for getCallerClass

    Details

      Description

      In a simple benchmark like:

      @Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
      @Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
      @Fork(value = 3, jvmArgsAppend = {"-Xmx1g", "-Xms1g"})
      @State(Scope.Benchmark)
      @BenchmarkMode(Mode.AverageTime)
      @OutputTimeUnit(TimeUnit.MICROSECONDS)
      public class CallerClassBench {

          static final StackWalker INST = StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);

          @Benchmark
          public Class<?> stackWalker() {
              return INST.getCallerClass();
          }
      }

      ...it quickly becomes evident that a MIN_BATCH_SIZE of 8 is too large, since getCallerClass only reaches for the 4th frame on the stack. Dropping it to 4 like this:

      $ git diff
      diff --git a/src/java.base/share/classes/java/lang/StackStreamFactory.java b/src/java.base/share/classes/java/lang/StackStreamFactory.java
      index 22cbb2e170a..7abf6c88aaf 100644
      --- a/src/java.base/share/classes/java/lang/StackStreamFactory.java
      +++ b/src/java.base/share/classes/java/lang/StackStreamFactory.java
      @@ -65,10 +65,15 @@ final class StackStreamFactory {
           // lazily add subclasses when they are loaded.
           private static final Set<Class<?>> stackWalkImplClasses = init();
       
      -    private static final int SMALL_BATCH = 8;
      +    // Minimum batch size for any walker. The shortest walk is for getCallerClass,
      +    // which would need to skip a few of StackWalker own frames.
      +    private static final int MIN_BATCH_SIZE = 4;
      +
      +    // Heuristic sizes to balance the stack walk costs, for both smaller and
      +    // larger stacks.
      +    private static final int SMALL_BATCH = MIN_BATCH_SIZE;
           private static final int BATCH_SIZE = 32;
           private static final int LARGE_BATCH_SIZE = 256;
      -    private static final int MIN_BATCH_SIZE = SMALL_BATCH;
       
           // These flags must match the values maintained in the VM
           @Native private static final int DEFAULT_MODE = 0x0;

      ...improves getCallerClass performance significantly, roughly halving the average time per call:
        before: 0.884 ± 0.036 us/op
        after:  0.461 ± 0.012 us/op
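
      For readers less familiar with the API, the following is a minimal sketch (not part of the patch) of why such a short walk is enough: getCallerClass only needs the frame directly above the method that calls it, which can also be expressed with the public walk API by skipping one frame. The class and method names (CallerClassDemo, Callee, viaGetCallerClass, viaWalk) are hypothetical and exist only for illustration. The sketch says nothing about how many internal StackWalker frames are traversed, since those are not visible through the public API.

      ```java
      import java.lang.StackWalker.Option;
      import java.lang.StackWalker.StackFrame;

      // Hypothetical demo class, not part of the JDK or of the patch above.
      final class CallerClassDemo {

          // RETAIN_CLASS_REFERENCE is required for getCallerClass()
          // and for StackFrame.getDeclaringClass().
          static final StackWalker WALKER =
                  StackWalker.getInstance(Option.RETAIN_CLASS_REFERENCE);

          static final class Callee {
              // Direct lookup: returns the class of whoever called this method.
              static Class<?> viaGetCallerClass() {
                  return WALKER.getCallerClass();
              }

              // Equivalent lookup through the general walk API: the first frame
              // in the stream is this method, so skip it and take the next one.
              static Class<?> viaWalk() {
                  return WALKER.walk(frames -> frames.skip(1).findFirst())
                               .map(StackFrame::getDeclaringClass)
                               .orElseThrow();
              }
          }

          // Wrappers so that the caller is always CallerClassDemo.
          static Class<?> callerSeenByGetCallerClass() {
              return Callee.viaGetCallerClass();
          }

          static Class<?> callerSeenByWalk() {
              return Callee.viaWalk();
          }

          public static void main(String[] args) {
              System.out.println(callerSeenByGetCallerClass() == CallerClassDemo.class);
              System.out.println(callerSeenByWalk() == CallerClassDemo.class);
          }
      }
      ```

      Both lookups resolve to the same class here. Only a couple of application frames are ever consumed, which is consistent with the observation that a smaller initial batch is sufficient for this use case.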

            People

            Assignee:
            shade Aleksey Shipilev
            Reporter:
            shade Aleksey Shipilev