JDK-8227745

Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents



    • Enhancement
    • Resolution: Fixed
    • P4
    • 16
    • None
    • hotspot
    • b21


      Escape analysis (EA) should stay enabled, for better performance, when the VM is
      running with JVMTI agents loaded.

      The main intent is to be able to start a production system in a mode that allows
      a debugging session to be initiated at any later time, if necessary or desired,
      without the need to disable escape analysis at start-up. In most cases debugging
      will never be activated, and the production system should run at the best
      possible performance while still being ready for debugging. The enhancement also
      improves performance when a debugger has attached to the VM.

      Another important scenario for the enhancement is heap diagnostics. Agents with
      that purpose need not be loaded at start-up. They can be loaded into a running
      system whenever necessary or desired. Unfortunately the current JVMTI
      implementation does not and cannot give access to scalar replaced objects, which
      can hinder diagnostics. JDK-8233915 is an example of this issue and will also be
      fixed by this enhancement.

      Currently EA is disabled if a JVMTI agent added the capability
      can_access_local_variables, because an access to a local reference variable
      potentially changes the escape state of the referenced object and thereby
      invalidates optimizations based on it.
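
      To make the problem concrete, the following Java sketch shows a method whose
      local object is a typical candidate for scalar replacement. The class and method
      names are made up for illustration, and whether C2 actually eliminates the
      allocation depends on inlining and other heuristics.

```java
// Illustrative only: the class and method names are hypothetical.
public class EscapeExample {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // 'p' never leaves this method: it is not stored in a field, not returned,
    // and not passed to code that could publish it. An optimizing compiler may
    // therefore keep x and y in registers and skip the heap allocation.
    public static int manhattan(int x, int y) {
        Point p = new Point(x, y);
        // An agent with can_access_local_variables could read the local 'p'
        // at this point (for example with the JVMTI function GetLocalObject).
        // The object would then escape, so the optimization above would no
        // longer be valid for this activation.
        return Math.abs(p.x) + Math.abs(p.y);
    }

    public static void main(String[] args) {
        long sum = 0;
        // Call the method often enough that a JIT compiler is likely to compile it.
        for (int i = 0; i < 1_000_000; i++) {
            sum += manhattan(i, -i);
        }
        System.out.println(sum);
    }
}
```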

      There are more JVMTI capabilities that allow agents to acquire object references
      from stack frames:

      1. can_access_local_variables
      2. can_get_owned_monitor_info
      3. can_get_owned_monitor_stack_depth_info
      4. can_tag_objects
         This allows for example to walk the object graph beginning at its roots,
         which include local variables.

      JDK-8230677 switches EA off if capabilities 2. or 3. are taken. This workaround is
      not possible for 4., because can_tag_objects is an always capability. JDK-8233915
      tracks this issue.

      In addition EA is disabled if

      5. can_pop_frame

      is added. Not because it gives access to local variables, but because the
      implementation of PopFrame interferes with object reallocation during
      deoptimization of compiled frames.

      It is likely a bug that EA is not disabled if

      6. can_force_early_return

      is added as ForceEarlyReturn has the same issues with deoptimization.

      This enhancement shall allow the JVM to run with escape analysis enabled even if any of the
      capabilities 1. to 6. is requested by a JVMTI agent.

      Summary of Proposed Implementation

      The JVMTI implementation is changed to revert EA based optimizations just before
      objects escape through JVMTI. At runtime there is no escape information for each
      individual object in scope. Instead, each scope is annotated with whether
      non-escaping objects exist in it and whether some of them are passed as
      parameters. If a JVMTI agent accesses a reference on the stack, then the owning
      compiled frame C is deoptimized if any non-escaping object is in scope. Scalar
      replaced objects are reallocated on the heap and objects with eliminated locking
      are relocked. This is called "deoptimizing objects".

      If the agent accesses a reference in a callee frame of C, and C is passing any
      non-escaping object as an argument, then C and its objects are deoptimized as well.
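
      The decision rule in this summary can be sketched as a small Java model. It is
      not HotSpot code: the class DeoptDecision, the Frame type and its field names
      are hypothetical, and the sketch assumes that every caller above the accessed
      frame is checked for non-escaping arguments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the rule described above; this is not HotSpot code.
public class DeoptDecision {

    // One compiled frame with the two per-scope annotations from the text.
    public static final class Frame {
        final String name;
        final boolean hasNonEscapingInScope;   // non-escaping objects exist in scope
        final boolean passesNonEscapingAsArg;  // some of them are passed to the callee

        public Frame(String name, boolean hasNonEscapingInScope,
                     boolean passesNonEscapingAsArg) {
            this.name = name;
            this.hasNonEscapingInScope = hasNonEscapingInScope;
            this.passesNonEscapingAsArg = passesNonEscapingAsArg;
        }
    }

    // 'stack' is ordered from the youngest frame (index 0) to the oldest.
    // 'accessed' is the index of the frame whose reference the agent reads.
    // Returns the names of the frames whose objects must be deoptimized.
    public static List<String> framesToDeoptimize(List<Frame> stack, int accessed) {
        List<String> result = new ArrayList<>();
        Frame target = stack.get(accessed);
        if (target.hasNonEscapingInScope) {
            result.add(target.name);
        }
        // Assumption: every caller above the accessed frame is checked, because
        // an object it passed down could be reached through the callee.
        for (int i = accessed + 1; i < stack.size(); i++) {
            Frame caller = stack.get(i);
            if (caller.passesNonEscapingAsArg) {
                result.add(caller.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Frame> stack = List.of(
            new Frame("callee", false, false),
            new Frame("C", true, true),
            new Frame("outer", true, false));
        // The agent reads a reference in "callee"; only "C" passes a
        // non-escaping object down, so only "C" is selected.
        System.out.println(framesToDeoptimize(stack, 0)); // prints [C]
    }
}
```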

      Deoptimizing Objects

      Early reallocation of scalar replaced (aka virtual) objects, where reallocation
      is done independently of and potentially long before replacing the owning
      compiled frame with equivalent interpreter frames, is a preexisting
      functionality that is leveraged by the enhancement (see

      Reallocating and relocking objects is called "deoptimizing objects".
      Deoptimized objects are kept as deferred updates (preexisting
      JavaThread::_deferred_locals_updates). Either all objects of a compiled frame
      are deoptimized or none. Once this has happened, it is recorded at the
      corresponding deferred updates in order to avoid doing it twice.


      The class EscapeBarrier is the interface to synchronize and trigger
      deoptimization before objects escape.

      C2 Changes

      During EA, C2 annotates each safepoint with whether it has non-escaping objects
      in scope, and each Java call with whether it has non-escaping objects in its
      parameter list. This information is persisted in the CompiledMethod's debug
      information.

      Escape Information at Runtime

      There is preexisting information about scalar replaced objects and eliminated
      locking (note that locks are not only eliminated based on EA, but
      also nested locks are omitted).
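
      The two kinds of eliminated locking mentioned here can be illustrated in Java.
      This is a sketch with made-up names; whether a particular JIT compilation
      really removes these lock operations is not guaranteed. An agent with
      can_get_owned_monitor_info that asks which monitors a thread owns is the kind
      of access that makes relocking necessary.

```java
// Illustrative only: the class and method names are hypothetical.
public class LockElisionExample {

    // The StringBuffer is local and never published, so no other thread can
    // contend for its monitor. append() is a synchronized method; once it is
    // inlined, a JIT compiler may drop the lock operations based on EA.
    public static String join(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);
        sb.append(b);
        return sb.toString();
    }

    private static final Object LOCK = new Object();
    private static int counter;

    // LOCK is shared, so it does escape. The inner synchronized block only
    // re-acquires a monitor the thread already holds. Such a nested lock can
    // be omitted without any escape information.
    public static int incrementNested() {
        synchronized (LOCK) {
            synchronized (LOCK) {
                return ++counter;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(join("escape", "-analysis"));
        System.out.println(incrementNested());
    }
}
```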

      The implementation adds information about non-escaping objects in scope and in
      argument lists at call sites:



      Competing agents use the new flag '_obj_deopt' in Thread::_suspend_flags and
      the new Monitor EscapeBarrier_lock to synchronize and to suspend their
      target thread.

      Deoptimization can be concurrent for different target threads.

      A self deoptimization cannot be concurrent with other deoptimizations.

      Deoptimizing everything (e.g. before heap walks) cannot be concurrent with other
      deoptimizations.

      See EscapeBarrier::sync_and_suspend_one() and EscapeBarrier::sync_and_suspend_all()
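
      The concurrency rules above can be written down as a compatibility check. The
      Java model below is only a reading of those rules, not the EscapeBarrier
      implementation: the names DeoptCompatibility, Kind and Request are
      hypothetical, and it assumes that two requests for the same target thread are
      serialized.

```java
// Hypothetical model of the concurrency rules listed above; not HotSpot code.
public class DeoptCompatibility {

    public enum Kind {
        OTHER_THREAD,  // an agent thread deoptimizes objects of one target thread
        SELF,          // a thread deoptimizes its own objects
        ALL_THREADS    // everything is deoptimized, e.g. before a heap walk
    }

    public static final class Request {
        final Kind kind;
        final String target; // name of the target thread; unused for ALL_THREADS

        public Request(Kind kind, String target) {
            this.kind = kind;
            this.target = target;
        }
    }

    // Returns true if the two deoptimization requests may proceed at the same time.
    public static boolean mayRunConcurrently(Request a, Request b) {
        // Self deoptimization and deoptimizing everything exclude all others.
        if (a.kind != Kind.OTHER_THREAD || b.kind != Kind.OTHER_THREAD) {
            return false;
        }
        // Assumption: requests for the same target thread are serialized.
        return !a.target.equals(b.target);
    }

    public static void main(String[] args) {
        Request t1 = new Request(Kind.OTHER_THREAD, "T1");
        Request t2 = new Request(Kind.OTHER_THREAD, "T2");
        Request all = new Request(Kind.ALL_THREADS, null);
        System.out.println(mayRunConcurrently(t1, t2));  // prints true
        System.out.println(mayRunConcurrently(t1, all)); // prints false
    }
}
```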

      PopFrame and ForceEarlyReturn

      Objects are deoptimized before the PopFrame/ForceEarlyReturn operation and
      JVMTI_ERROR_OUT_OF_MEMORY is returned if reallocations fail. This avoids
      reallocation failures during the operation.
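
      The ordering described here, where objects are deoptimized first and the
      operation runs only afterwards, can be sketched as follows. The class
      PopFrameOrdering and its parameters are hypothetical stand-ins for the real
      steps. Only the two status names correspond to JVMTI error codes.

```java
import java.util.function.BooleanSupplier;

// Hypothetical model of the ordering described above; not HotSpot code.
public class PopFrameOrdering {

    public enum Status { JVMTI_ERROR_NONE, JVMTI_ERROR_OUT_OF_MEMORY }

    // 'deoptimizeObjects' stands in for reallocating and relocking the objects
    // of the affected frames; it returns false if a reallocation fails.
    // 'operation' stands in for the actual PopFrame or ForceEarlyReturn work.
    public static Status run(BooleanSupplier deoptimizeObjects, Runnable operation) {
        if (!deoptimizeObjects.getAsBoolean()) {
            // Nothing has been changed on the target stack yet, so the
            // failure can simply be reported to the agent.
            return Status.JVMTI_ERROR_OUT_OF_MEMORY;
        }
        // From here on no reallocation is needed, so none can fail.
        operation.run();
        return Status.JVMTI_ERROR_NONE;
    }

    public static void main(String[] args) {
        Status s = run(() -> false, () -> System.out.println("frame popped"));
        System.out.println(s); // prints JVMTI_ERROR_OUT_OF_MEMORY
    }
}
```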


      Performance should not be affected if no JVMTI agent is loaded.

      If a JVMTI agent is loaded that adds any of the capabilities listed above, but
      remains inactive, then there should be a performance gain as high as the gain of

      The performance impact is still expected to be positive during interactive debugging.

      jvm2008 results are attached to the RFE.

      Microbenchmark results: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/


      The proposed implementation comes with a significant amount of dedicated test

      The new develop flag DeoptimizeObjectsALot allows for stress testing: internal
      threads are started that deoptimize frames and objects at intervals, in
      milliseconds, given by DeoptimizeObjectsALotInterval. The number of threads
      started is given by DeoptimizeObjectsALotThreadCountAll and
      DeoptimizeObjectsALotThreadCountSingle. The former targets all existing threads,
      whereas the latter operates on a single thread selected round robin.


              rrich Richard Reingruber