JEP 435: Asynchronous Stack Trace VM API

Details

    • JEP
    • Status: Candidate
    • P4
    • Resolution: Unresolved
    • None
    • hotspot
    • None
    • Johannes Bechberger
    • Feature
    • Open
    • svc
    • JDK
    • serviceability dash dev at openjdk dot org
    • S
    • S
    • 435

    Description

      Summary

      Define an efficient and reliable API to collect stack traces asynchronously and include information on both Java and native stack frames.

      Goals

      • Provide a well-tested API for profilers to obtain information on Java and native frames.

      • Support asynchronous usage, e.g., calling from signal handlers.

      • Do not affect performance when the API is not in use.

      • Do not significantly increase memory requirements compared to the existing AsyncGetCallTrace API.

      Non-Goals

      • It is not a goal to recommend the new API for production use, since it can crash the VM. We will minimize the chances of that via extensive testing and fuzzing.

      Motivation

      The AsyncGetCallTrace API is used by almost all available profilers, both open-source and commercial, including async-profiler. Yet it has two major disadvantages:

      • It is an internal API, not exported in any header, and
      • It only returns information about Java frames, namely their method and bytecode indices.

      These issues make implementing profilers and related tooling more difficult. Some additional information can be extracted from the HotSpot VM via complex code, but other useful information is hidden and impossible to obtain:

      • Whether a compiled Java frame is inlined (currently only obtainable for the topmost compiled frames),
      • The compilation level of a Java frame (i.e., compiled by C1 or C2), and
      • Information on C/C++ frames that are not at the top of the stack.

      Such data can be helpful when profiling and tuning a VM for a given application, and for profiling code that uses JNI heavily.

      Description

      We propose a new AsyncGetStackTrace API, modeled on the AsyncGetCallTrace API:

      void AsyncGetStackTrace(CallTrace *trace, jint depth, void* ucontext,
                              uint32_t options);

      Profilers can call this API to obtain the stack trace of the current thread. Calling this API from a signal handler is safe, and the new implementation will be at least as stable as AsyncGetCallTrace or the JFR stack-walking code. The VM fills in the information about the frames and the number of frames. The caller of the API must allocate the frames array referenced by the CallTrace with enough space for the requested stack depth.

      Parameters:

      • trace — buffer for structured data to be filled in by the VM
      • depth — maximum depth of the call stack trace
      • ucontext — optional ucontext_t of the current thread when it was interrupted
      • options — bit set for options

      Currently only the lowest bit of the options is considered: It enables (1) or disables (0) the inclusion of C/C++ frames. All other bits are considered to be 0.
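
      As a small illustration of the options bit set, the following sketch builds and inspects an options value. The names kIncludeCFrames, make_options, and includes_c_frames are hypothetical; the proposal itself only states that the lowest bit toggles C/C++ frames.

```cpp
#include <cstdint>

// Hypothetical name: the proposal defines the bit's meaning, not a constant for it.
constexpr uint32_t kIncludeCFrames = 1u << 0;

// Build an options value for AsyncGetStackTrace.
constexpr uint32_t make_options(bool include_c_frames) {
  return include_c_frames ? kIncludeCFrames : 0u;
}

// Only the lowest bit is inspected; all higher bits are ignored here.
constexpr bool includes_c_frames(uint32_t options) {
  return (options & kIncludeCFrames) != 0;
}
```

      A profiler that wants mixed Java and C/C++ traces would pass make_options(true) as the last argument.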

      The trace struct

      typedef struct {
        jint num_frames;                // number of frames in this trace
        CallFrame *frames;              // frames
        void* frame_info;               // more information on frames
      } CallTrace;

      is filled in by the VM. Its num_frames field contains the actual number of frames in the frames array or an error code. The frame_info field in that structure can later be used to store more information, but is currently NULL.
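
      The caller-owned buffer described above could be prepared as in the following sketch. It repeats the declarations from this section, uses stand-ins for the jni.h types so that it compiles without a JDK, and calls the API through a function pointer. TraceBuffer, sample, and fake_async_get_stack_trace are hypothetical names; the fake function merely imitates a VM that reports up to two C/C++ frames.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins for <jni.h> (assumption: a real profiler includes jni.h instead).
typedef int32_t jint;
typedef struct OpaqueMethod *jmethodID;

enum FrameTypeId : uint8_t {
  FRAME_JAVA = 1, FRAME_JAVA_INLINED = 2, FRAME_NATIVE = 3, FRAME_STUB = 4, FRAME_CPP = 5
};
typedef struct { FrameTypeId type; int8_t comp_level; uint16_t bci; jmethodID method_id; } JavaFrame;
typedef struct { FrameTypeId type; void *pc; } NonJavaFrame;
typedef union { FrameTypeId type; JavaFrame java_frame; NonJavaFrame non_java_frame; } CallFrame;
typedef struct { jint num_frames; CallFrame *frames; void *frame_info; } CallTrace;

// Function pointer type matching the proposed signature.
typedef void (*AsyncGetStackTraceFn)(CallTrace *trace, jint depth, void *ucontext, uint32_t options);

// Hypothetical stand-in for the VM: fills in at most two C/C++ frames.
static void fake_async_get_stack_trace(CallTrace *trace, jint depth, void *, uint32_t) {
  jint n = depth < 2 ? depth : 2;
  for (jint i = 0; i < n; i++) {
    trace->frames[i].non_java_frame.type = FRAME_CPP;
    trace->frames[i].non_java_frame.pc = nullptr;
  }
  trace->num_frames = n;
}

// Hypothetical helper: storage is allocated once, before sampling starts,
// so that no allocation is needed inside a signal handler.
struct TraceBuffer {
  std::vector<CallFrame> storage;
  CallTrace trace;
  explicit TraceBuffer(jint max_depth) : storage(static_cast<std::size_t>(max_depth)) {
    trace.num_frames = 0;
    trace.frames = storage.data();
    trace.frame_info = nullptr;
  }
  TraceBuffer(const TraceBuffer &) = delete;  // trace.frames points into storage
};

// Take one sample; returns num_frames (a frame count or an error code).
static jint sample(AsyncGetStackTraceFn fn, TraceBuffer &buf, void *ucontext, uint32_t options) {
  assert(fn != nullptr);
  fn(&buf.trace, static_cast<jint>(buf.storage.size()), ucontext, options);
  return buf.trace.num_frames;
}
```

      Allocating the buffer up front is a design choice of this sketch: memory allocation is generally not async-signal-safe, so a handler should only reuse storage that already exists.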

      The error codes are a subset of the error codes for AsyncGetCallTrace, with the addition of THREAD_NOT_JAVA related to calling this procedure for non-Java threads:

      enum Error {
        NO_JAVA_FRAME         =   0,
        NO_CLASS_LOAD         =  -1, 
        GC_ACTIVE             =  -2,    
        UNKNOWN_NOT_JAVA      =  -3,
        NOT_WALKABLE_NOT_JAVA =  -4,
        UNKNOWN_JAVA          =  -5,
        UNKNOWN_STATE         =  -7,
        THREAD_EXIT           =  -8,
        DEOPT                 =  -9,
        THREAD_NOT_JAVA       = -10
      };
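
      Because num_frames carries either a frame count or one of these error codes, a consumer has to distinguish the two cases before touching the frames array. A minimal sketch, assuming the enum above; has_frames and describe_num_frames are hypothetical helper names intended for post-processing outside the signal handler:

```cpp
#include <cstdint>
#include <string>

typedef int32_t jint;  // stand-in for <jni.h>

enum Error {
  NO_JAVA_FRAME         =   0,
  NO_CLASS_LOAD         =  -1,
  GC_ACTIVE             =  -2,
  UNKNOWN_NOT_JAVA      =  -3,
  NOT_WALKABLE_NOT_JAVA =  -4,
  UNKNOWN_JAVA          =  -5,
  UNKNOWN_STATE         =  -7,
  THREAD_EXIT           =  -8,
  DEOPT                 =  -9,
  THREAD_NOT_JAVA       = -10
};

// True only if the frames array holds at least one valid entry.
static bool has_frames(jint num_frames) { return num_frames > 0; }

// Map num_frames to a readable label for logs or reports.
static std::string describe_num_frames(jint num_frames) {
  switch (num_frames) {
    case NO_JAVA_FRAME:         return "NO_JAVA_FRAME";
    case NO_CLASS_LOAD:         return "NO_CLASS_LOAD";
    case GC_ACTIVE:             return "GC_ACTIVE";
    case UNKNOWN_NOT_JAVA:      return "UNKNOWN_NOT_JAVA";
    case NOT_WALKABLE_NOT_JAVA: return "NOT_WALKABLE_NOT_JAVA";
    case UNKNOWN_JAVA:          return "UNKNOWN_JAVA";
    case UNKNOWN_STATE:         return "UNKNOWN_STATE";
    case THREAD_EXIT:           return "THREAD_EXIT";
    case DEOPT:                 return "DEOPT";
    case THREAD_NOT_JAVA:       return "THREAD_NOT_JAVA";
    default:                    return num_frames > 0 ? "OK" : "UNRECOGNIZED";
  }
}
```

      Values that are not part of the enum above, such as -6, fall through to "UNRECOGNIZED" in this sketch.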

      Every CallFrame is a union, since the information stored for Java and non-Java frames differs:

      typedef union {
        FrameTypeId type;     // to distinguish between JavaFrame and NonJavaFrame 
        JavaFrame java_frame;
        NonJavaFrame non_java_frame;
      } CallFrame;

      There are several distinguishable frame types:

      enum FrameTypeId : uint8_t {
        FRAME_JAVA         = 1, // JIT compiled and interpreted
        FRAME_JAVA_INLINED = 2, // inlined JIT compiled
        FRAME_NATIVE       = 3, // native wrapper to call C methods from Java
        FRAME_STUB         = 4, // VM generated stubs
        FRAME_CPP          = 5  // C/C++/... frames
      };

      The first two types are for Java frames. For these, and for the native wrapper frames of type FRAME_NATIVE, we store the following information in a struct of type JavaFrame:

      typedef struct {     
        FrameTypeId type;       // frame type
        int8_t comp_level;      // compilation level, 0 is interpreted
        uint16_t bci;           // 0 <= bci < 65536
        jmethodID method_id;
      } JavaFrame;              // used for FRAME_JAVA, FRAME_JAVA_INLINED and FRAME_NATIVE

      The comp_level indicates the compilation level of the method related to the frame, with higher numbers representing higher levels of compilation. It is modeled after the CompLevel enum in HotSpot but is dependent on the compiler infrastructure used. A value of zero indicates no compilation, i.e., bytecode interpretation.

      Information on all other frames is stored in NonJavaFrame structs:

      typedef struct {
        FrameTypeId type;  // frame type
        void *pc;          // current program counter inside this frame
      } NonJavaFrame;  
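
      To illustrate how a consumer might walk a filled-in trace, the following sketch dispatches on the frame type and counts frames by kind. It repeats the declarations from this section with stand-ins for the jni.h types; Summary and summarize are hypothetical names. The type is read through java_frame.type, which is valid for either struct because both begin with the same type field.

```cpp
#include <cstdint>

// Stand-ins for <jni.h> (assumption: a real profiler includes jni.h instead).
typedef int32_t jint;
typedef struct OpaqueMethod *jmethodID;

enum FrameTypeId : uint8_t {
  FRAME_JAVA = 1, FRAME_JAVA_INLINED = 2, FRAME_NATIVE = 3, FRAME_STUB = 4, FRAME_CPP = 5
};
typedef struct { FrameTypeId type; int8_t comp_level; uint16_t bci; jmethodID method_id; } JavaFrame;
typedef struct { FrameTypeId type; void *pc; } NonJavaFrame;
typedef union { FrameTypeId type; JavaFrame java_frame; NonJavaFrame non_java_frame; } CallFrame;
typedef struct { jint num_frames; CallFrame *frames; void *frame_info; } CallTrace;

// Hypothetical result type: number of frames seen per kind.
struct Summary {
  int interpreted = 0;      // FRAME_JAVA with comp_level == 0
  int compiled = 0;         // FRAME_JAVA with comp_level > 0
  int inlined = 0;          // FRAME_JAVA_INLINED
  int native_wrappers = 0;  // FRAME_NATIVE
  int non_java = 0;         // FRAME_STUB and FRAME_CPP
};

static Summary summarize(const CallTrace &trace) {
  Summary s;
  // The loop body never runs when num_frames holds an error code (<= 0).
  for (jint i = 0; i < trace.num_frames; i++) {
    const CallFrame &f = trace.frames[i];
    switch (f.java_frame.type) {
      case FRAME_JAVA:
        if (f.java_frame.comp_level == 0) s.interpreted++; else s.compiled++;
        break;
      case FRAME_JAVA_INLINED: s.inlined++; break;
      case FRAME_NATIVE:       s.native_wrappers++; break;
      case FRAME_STUB:
      case FRAME_CPP:          s.non_java++; break;
    }
  }
  return s;
}
```

      A real consumer would resolve method_id and pc to names instead of counting, but the dispatch on the frame type would look the same.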

      Although the API provides more information, the amount of space required per frame (e.g., 16 bytes on x86-64) is the same as for the existing AsyncGetCallTrace API.
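
      The size claim can be checked at compile time, as in the following sketch. ASGCT_CallFrame is the per-frame record that profilers commonly declare themselves for AsyncGetCallTrace, since that API is not exported in a header; its name and the stand-ins for the jni.h types are assumptions of this sketch. The check relies on the usual alignment rules of common 32-bit and 64-bit ABIs.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins for <jni.h> (assumption: a real profiler includes jni.h instead).
typedef int32_t jint;
typedef struct OpaqueMethod *jmethodID;

enum FrameTypeId : uint8_t {
  FRAME_JAVA = 1, FRAME_JAVA_INLINED = 2, FRAME_NATIVE = 3, FRAME_STUB = 4, FRAME_CPP = 5
};
typedef struct { FrameTypeId type; int8_t comp_level; uint16_t bci; jmethodID method_id; } JavaFrame;
typedef struct { FrameTypeId type; void *pc; } NonJavaFrame;
typedef union { FrameTypeId type; JavaFrame java_frame; NonJavaFrame non_java_frame; } CallFrame;

// Per-frame record as commonly declared for the existing AsyncGetCallTrace API.
typedef struct { jint lineno; jmethodID method_id; } ASGCT_CallFrame;

// The new per-frame record should not be larger than the old one.
static_assert(sizeof(CallFrame) == sizeof(ASGCT_CallFrame),
              "CallFrame should occupy the same space as ASGCT_CallFrame");

// With 8-byte pointers: 1 + 1 + 2 bytes of fields, 4 bytes of padding, 8-byte pointer.
static_assert(sizeof(void *) != 8 || sizeof(CallFrame) == 16,
              "expected 16 bytes per frame with 8-byte pointers");
```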

      We propose to place the above declarations in a new header file, profile.h, which will be placed in the include directory of the JDK image. The header’s license should include the Classpath Exception so that it is consumable by third-party profiling tools.

      A prototype implementation can be found here, and a demo combining it with a modified async-profiler can be found here.

      Risks and Assumptions

      Returning information on C/C++ frames leaks implementation details, but this is also true for the Java frames of AsyncGetCallTrace since they leak details of the implementation of standard library files and include native wrapper frames.

      Testing

      We will add new stress tests to identify stability problems on all supported platforms. We plan to profile a set of example programs (e.g., the DaCapo and Renaissance benchmark suites) repeatedly with small profiling intervals (<= 0.1 ms). We will also add substantial unit tests that cover all options and test the basic usage of the API.

            People

              jbechberger Johannes Bechberger
              jbechberger Johannes Bechberger
              Christoph Langer Christoph Langer
              Andrei Pangin, Christoph Langer, Jaroslav Bachorík