Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8267532

C2: Profile and prune untaken exception handlers

    XMLWordPrintable

Details

    • b26

    Description

      Disclaimer: while this investigation was motivated by Project Panama, I can reproduce the findings on a vanilla JDK.

      There seems to be an issue when calling a close() method in a try/catch or try/finally block. In other words, this idiom:

      Resource r = createResource()
      doSomething(r);
      r.close();

      Is significantly faster than this other idiom:

      Resource r = createResource()
      try {
         doSomething(r);
         r.close();
      } finally {
         r.close();
      }

      or this idiom:

      Resource r = createResource()
      try {
         doSomething(r);
         r.close();
      } catch (Throwable t) {
         r.close();
         throw t;
      }

      The attached benchmark can be used to reproduce these findings:

      Benchmark Mode Cnt Score Error Units
      ResourceScopeCloseMin.confined_close avgt 30 14.398 ? 0.175 ns/op
      ResourceScopeCloseMin.confined_close_notry avgt 30 6.375 ? 0.053 ns/op

      I've tried to remove all factors which could lead to potential confusion in JIT, such as interfaces, or javac-generated try-with-resources blocks (which contain extra calls to add suppressed exceptions which might have skewed the benchmark).

      If I run with -XX:+PrintInlining I always see something like this:

                                    @ 25 org.openjdk.bench.jdk.incubator.foreign.ResourceScopeCloseMin$ConfinedScope::close (48 bytes) already compiled into a medium method

      And, inspecting the bytecode reveals that @25 is indeed the BCI of the close() call inside the catch block. Commenting out that call makes everything fast again.

      ---

      The issue is essentially that the second call to close() in the finally/catch block is never reached, so C2 doesn't inline the call, which makes the Resource object escape, and not scalarizable. This leads then to the difference in performance. C2 normally replaces infrequently taken branches with an uncommon trap, but doesn't use an uncommon trap for the finally/catch blocks, since there is currently no information available to indicate whether finally/catch blocks are taken or not. They are entered through exception dispatch rather than a regular branch through the 'goto' bytecode (which does have profiling implemented for it).

      To proposed solution is to implement profiling for exception handlers (which is the underlying bytecode implementation of the finally/catch blocks), and then use that profiling information in C2 to emit uncommon traps in place of exception handlers that are never entered. In the examples above, this would eliminate the second call to close() altogether, meaning the resource object also does not escape and can be scalar replaced. This brings the performance on par with the base case.

      Attachments

        Issue Links

          Activity

            People

              jvernee Jorn Vernee
              mcimadamore Maurizio Cimadamore
              Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: