JDK-8177802

finally blocks can be bypassed due to StackOverflowError


      FULL PRODUCT VERSION :
      openjdk version "1.8.0_121"
      OpenJDK Runtime Environment (build 1.8.0_121-8u121-b13-0ubuntu1.16.04.2-b13)
      OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)

      openjdk version "9-internal"
      OpenJDK Runtime Environment (build 9-internal+0-2016-04-14-195246.buildd.src)
      OpenJDK 64-Bit Server VM (build 9-internal+0-2016-04-14-195246.buildd.src, mixed mode)

      java version "1.8.0_111"
      Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
      Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

      java version "1.7.0_111"
      OpenJDK Runtime Environment (IcedTea 2.6.7) (7u111-2.6.7-2~deb8u1)
      OpenJDK 64-Bit Server VM (build 24.111-b01, mixed mode)

      ADDITIONAL OS VERSION INFORMATION :
      Linux stefan-y260 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

      Linux stefan-x200 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt7-1 (2015-03-01) x86_64 GNU/Linux

      Microsoft Windows [Version 10.0.14393]

      A DESCRIPTION OF THE PROBLEM :
      ********************************************************************************
      Please see the comments section of the bug report. The submitter agrees that this is not a bug, but requested the following enhancement:
      "JVM should have some additional emergency stack space that only becomes available right after a StackOverflowError was caught, so in a catch or finally block, basic things like System.out.println, close() or unlock() calls would be guaranteed to work (e.g. the emergency stack memory should have space for a call depth of 10 or so).
      This would be of enormous help for developers that happen to debug such pathological scenarios."
      ********************************************************************************

      If a StackOverflowError is caught, then an adjacent finally block can be bypassed (see "code for an executable test case" for an example).

      The concern is that, while the StackOverflowError most likely points to a bug, this construct allows normal execution to resume even though some cleanup code in a finally block has been skipped. Specifically, we observed this to leave a lock unreleased, leading to a deadlock much later in a seemingly unrelated context. This issue can thus cause very subtle, hard-to-find bugs, including deadlocks and freezes that cannot be traced via jstack output.

      The described behavior was discovered by running the Jython test suite and is discussed in ticket http://bugs.jython.org/issue2536. The setting involves guava, which uses ReentrantLock to provide multithreaded data structures of various kinds. In case of a StackOverflowError, such a lock might not get unlocked, while the error is silently swallowed and normal execution resumes. Many tests later, the unreleased lock can cause a deadlock. We observed several variants of how such a deadlock can occur, e.g. http://bugs.jython.org/issue2565.
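
The lock handling described above follows the usual lock / try / finally / unlock idiom. The following is a minimal sketch of that idiom together with a simple probe for a leaked lock. It is not guava's code: the class name LockLeakProbe and the helpers guarded and stillHeld are hypothetical names chosen for illustration, and the failure is simulated with an ordinary exception rather than a stack overflow.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration class; not taken from guava or Jython.
final class LockLeakProbe {
    static final ReentrantLock LOCK = new ReentrantLock();

    // Runs body while holding LOCK; the unlock sits in finally,
    // which is the cleanup step this report is concerned about.
    static void guarded(Runnable body) {
        LOCK.lock();
        try {
            body.run();
        } finally {
            LOCK.unlock();
        }
    }

    // True if the calling thread still owns LOCK, i.e. an unlock was missed.
    static boolean stillHeld() {
        return LOCK.isHeldByCurrentThread();
    }

    public static void main(String[] args) {
        try {
            guarded(() -> { throw new IllegalStateException("simulated failure"); });
        } catch (IllegalStateException expected) {
            // The failure is swallowed here, as in the scenario described above.
        }
        System.out.println("lock still held: " + stillHeld());
    }
}
```

A probe like stillHeld, called right after an error has been swallowed, can flag a leaked lock close to where it happens instead of at the later deadlock.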

      Even if one considers this acceptable behavior for a JVM, given that a stack overflow can prevent normal operation, it would at least be a documentation issue, because https://docs.oracle.com/javase/tutorial/essential/exceptions/finally.html does not mention this scenario as a possible cause for bypassing a finally block.

      I suspect that a similar issue can be triggered by an OutOfMemoryError, but I have no proof so far. http://stackoverflow.com/questions/22969000/finally-block-will-be-executed-in-case-of-outofmemoryerror suggests this.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Run the "code for an executable test case" below several times.
      Most of the time it will yield the output explained in "Expected Result".

      From time to time (usually within less than 10 tries, so it is a matter of seconds to provoke this) you will get the output described in "Actual Result".

      Alternatively (but far more tedious) you can get the original setup that led me to discover this issue at https://github.com/Stewori/Jython_debug_2536; run test_loop.sh until it hangs.
      Note that it bundles a slightly modified debug-version of guava, which you can study at https://github.com/Stewori/guava_debug. Specifically, modifications to original guava can be viewed at https://github.com/Stewori/guava_debug/commit/94a2c646a98674dda4e5d946569f28855105720d.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      In the end it prints out two checksums, which must be equal if the finally block was properly executed every time, e.g.

      ...
      recursion 4479
      recursion 4480
      recursion 4481
      recursion 4482java.lang.StackOverflowError
      4481
      10041921
      10041921
      continue normal execution...

      ACTUAL -
      If this fails you will see something like

      ...
      recursion 4607
      recursion 4608
      recursion 4609
      recursion 4610
      recursion 4611java.lang.Stacjava.lang.StackOverflowErrorjava.lang.StackOverflowErrorjava.lang.StackOverflowError
      4608
      10619136
      10628355
      continue normal execution...

      If the checksums (10619136 and 10628355) differ, some finally blocks were bypassed. Note that the difference is 4609 + 4610, the values belonging to the frames that were also skipped by the catch: it printed 4608. The construct as a whole nevertheless resumes normal execution, and the missing resource cleanup is silently swallowed.
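
The arithmetic on the two checksums from the failing run can be checked with a few lines of Java. This is only a sketch for verifying the numbers quoted above; the class name ChecksumCheck and the helper triangular are hypothetical names introduced here.

```java
// Hypothetical helper class for checking the numbers from the failing run.
final class ChecksumCheck {
    // Sum of all integers from 0 to n inclusive.
    static long triangular(long n) {
        return n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        long sum = 10619136L;   // first checksum printed in the failing run
        long sum2 = 10628355L;  // second checksum printed in the failing run

        // The difference between the checksums equals 4609 + 4610.
        System.out.println(sum2 - sum == 4609 + 4610);

        // sum2 is the sum 0 + 1 + ... + 4610; sum is the sum 0 + 1 + ... + 4608.
        System.out.println(sum2 == triangular(4610));
        System.out.println(sum == triangular(4608));
    }
}
```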

      ERROR MESSAGES/STACK TRACES THAT OCCUR :
      This bug triggers no explicit error message.

      REPRODUCIBILITY :
      This bug can be reproduced occasionally.

      ---------- BEGIN SOURCE ----------
      public class Test {
          // Both counters grow by val on the way down; only sum is reduced again in finally.
          public static long sum = 0, sum2 = 0;

          public static void recursion(int val) {
              System.out.println("recursion " + val);
              sum += val;
              sum2 += val;
              try {
                  recursion(val + 1);
              } catch (StackOverflowError soe) {
                  System.out.println(soe);
                  System.out.println(val);
                  System.out.println(sum);
                  System.out.println(sum2);
              } finally {
                  sum -= val;
              }
          }

          public static void main(String[] args) {
              System.out.println("Start...");
              recursion(0);
              System.out.println("continue normal execution...");
          }
      }
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Select a higher value for -Xss such that no overflow occurs.
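
Besides the -Xss command-line option, a larger stack can also be requested for a single thread through the Thread constructor that takes a stackSize argument. The sketch below assumes the platform honors that value (the Java API documentation describes it as a suggestion that may be ignored). The class name BigStackDemo, its helpers, and the chosen sizes are hypothetical and only for illustration.

```java
// Hypothetical illustration class; sizes and names are assumptions.
final class BigStackDemo {
    // Simple recursion that needs n nested frames.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    // Runs depth(n) on a thread created with the requested stack size.
    // Returns -1 if the worker thread died, e.g. from a StackOverflowError.
    static int runWithStack(long stackBytes, int n) {
        int[] result = { -1 };
        Thread worker = new Thread(null, () -> result[0] = depth(n), "big-stack", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // Request 256 MiB of stack for a recursion 100000 frames deep.
        System.out.println(runWithStack(256L * 1024 * 1024, 100_000));
    }
}
```

Requesting the stack per thread keeps the larger size limited to the code that is known to recurse deeply, instead of raising it for every thread in the process.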

      Avoid catching StackOverflowError or one of its superclasses (Error or Throwable) in combination with a finally block.

      Note that if the StackOverflowError in the test case is not explicitly caught inside the recursion (i.e. none of its superclasses Error or Throwable is caught there either), I never observed the finally block to be bypassed. One can still resume normal execution by catching StackOverflowError outside of the recursion; this never seems to bypass the finally block.

      However, this requires code changes, and the described construct is present in many forms, e.g. in guava, which is widely used.
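
The second workaround, catching the error outside of the recursion, can be sketched as a variant of the test case. This is an illustrative rewrite, not code from the report: the class name CatchOutside and the method run are hypothetical, and the printing has been removed so that only the counter logic remains.

```java
// Hypothetical variant of the test case: the catch is moved out of the recursion.
final class CatchOutside {
    static long sum = 0;

    // Same bookkeeping as the test case, but with finally only (no catch).
    static void recursion(int val) {
        sum += val;
        try {
            recursion(val + 1);
        } finally {
            sum -= val;
        }
    }

    // Catches the StackOverflowError once, after the stack has unwound,
    // and returns the remaining checksum.
    static long run() {
        sum = 0;
        try {
            recursion(0);
        } catch (StackOverflowError soe) {
            // Handled outside of the recursion; normal execution resumes here.
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("remaining checksum: " + run());
        System.out.println("continue normal execution...");
    }
}
```

Because every frame that added its value also runs its finally block while the error propagates, a remaining checksum of 0 indicates that no cleanup was skipped.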
