Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8328610

JFR Method Timing & Tracing

XMLWordPrintable

    • Icon: JEP JEP
    • Resolution: Unresolved
    • Icon: P3 P3
    • None
    • hotspot
    • None
    • Erik Gahlin
    • Feature
    • Open
    • jfr
    • JDK
    • S
    • S

      Summary

      Extend JDK Flight Recorder (JFR) to enable bytecode-based method timing and tracing for quick and easy use.

      Goals

      • Gather method invocation statistics that are not sample-based.
      • Allow execution time and stack traces to be collected for specific methods without requiring source code modifications.
      • Allow method selection via command-line arguments, jcmd commands, configuration files, and over the network through Java Management Extensions (JMX).
      • Display results in the shell using the jfr view command for recording files and on a live JVM using the jcmd JFR.view command.

      Non-Goals

      • It is not a goal to collect method parameters and instance variables.
      • It is not a goal to select methods that lack a bytecode representation, such as abstract, native, or non-static, non-default interface methods.
      • It is not a goal to trace a large number of classes simultaneously, as this would significantly degrade performance. Method sampling is preferred for such use cases.
      • It is not a goal to stay within JFR's current default goal of less than 1% CPU overhead when method timing and tracing is enabled.

      Motivation

      Tracing and timing method invocations help developers identify performance bottlenecks, find the root causes of bugs, and optimize code. For instance, if an application takes an unusually long time to start, tracing static initializers can reveal class loading that could be deferred to a later stage. Another example is when optimizing a method. Timing its execution can confirm whether the code change has actually improved performance. A third example, an application may fail due to resource exhaustion, such as running out of database connections. By tracing the method responsible for opening a connection, developers can pinpoint where the database is accessed and take steps to manage resources more efficiently.

      During development, developers use various tools to analyze method execution, including:

      • Java Microbenchmark Harness (JMH) to measure method execution times.
      • Java Debug Interface (JDI) capable debuggers to set breakpoints and inspect the call stack.

      However, in production and test environments outside of the developer machine, timing and tracing pose challenges. Current approaches include:

      • Adding temporary log statements or JFR events to methods under investigation, a practice that is cumbersome at best and not feasible for third-party libraries or classes in the JDK.
      • Using sample-based profilers, such as the one built into JFR or third-party tools like Async-profiler, which capture stack traces for frequently executed methods but fall short of tracing and timing every invocation.
      • Using a Java Agent, such as the one employed by JDK Mission Control (JMC), to inject JFR events into Java code. While this approach works, performing instrumentation directly within the JDK has performance benefits. For instance, the JVM can filter methods, eliminating the need to parse the bytecode of every loaded class twice. Having this feature in the JDK also simplifies usage, as developers can specify methods to be traced without having to configure or install an agent.

      The JDK should provide means to time and trace method invocations with low overhead, allowing developers to easily and efficiently analyze and optimize their applications.

      Description

      The feature will be implemented as two new events, jdk.MethodTrace and jdk.MethodTiming, which both accept a filter to select the methods to trace and time. For example, to see what triggered a HashMap resize, the filter setting of the MethodTrace event could be set like this:

      $ java -XX:StartFlightRecording:jdk.MethodTrace#filter=java.util.HashMap::resize,filename=recording.jfr ...
      $ jfr print --events jdk.MethodTrace --stack-depth 20 recording.jfr
      $ jdk.MethodTrace {
          startTime = 00:39:26.379 (2025-03-05)
          duration = 0.00113 ms
          method = java.util.HashMap.resize()
          eventThread = "main" (javaThreadId = 3)
          stackTrace = [
            java.util.HashMap.putVal(int, Object, Object, boolean, boolean) line: 636
            java.util.HashMap.put(Object, Object) line: 619
            sun.awt.AppContext.put(Object, Object) line: 598
            sun.awt.AppContext.<init>(ThreadGroup) line: 240
            sun.awt.SunToolkit.createNewAppContext(ThreadGroup) line: 282
            sun.awt.AppContext.initMainAppContext() line: 260
            sun.awt.AppContext.getAppContext() line: 295
            sun.awt.SunToolkit.getSystemEventQueueImplPP() line: 1024
            sun.awt.SunToolkit.getSystemEventQueueImpl() line: 1019
            java.awt.Toolkit.getEventQueue() line: 1375
            java.awt.EventQueue.invokeLater(Runnable) line: 1257
            javax.swing.SwingUtilities.invokeLater(Runnable) line: 1415
            java2d.J2Ddemo.main(String[]) line: 674
      ] }

      To simplify usage, the two XML configuration files (default.jfc and profile.jfc) located in JDK_HOME/lib/jfr will be enhanced with two new options: method-trace and method-timing. These options will control the filter settings for the MethodTrace and MethodTiming events. Specifically, the method-trace option will record a stack trace for methods that match the filter, while the method-timing option will count the number of invocations and calculate the average execution time for a method. These options can be used on the command line when starting Java, or with jcmd on an already running JVM. If no configuration file is specified, the options will control settings in default.jfc.

      Additionally, the jfr view and jcmd <pid> JFR.view commands will be enhanced so both method tracing and timing results can be displayed in the shell.

      For example, if an application suffers from slow startup, timing the execution of all static initializers may give an idea of where lazy initialization could be used. By omitting the class name and using <clinit> as the method name in the filter, all static initializers are timed.

      $ java '-XX:StartFlightRecording:method-timing=::<clinit>,filename=clinit.jfr' ...
      $ jfr view method-timing clinit.jfr
                                                  Method Timing
      
      Timed Method                                                                 Invocations Average Time
      ---------------------------------------------------------------------------- ----------- ------------
      sun.font.HBShaper.<clinit>()                                                           1 32.500000 ms
      java.awt.GraphicsEnvironment$LocalGE.<clinit>()                                        1 32.400000 ms
      java2d.DemoFonts.<clinit>()                                                            1 21.200000 ms
      java.nio.file.TempFileHelper.<clinit>()                                                1 17.100000 ms
      sun.security.util.SecurityProviderConstants.<clinit>()                                 1  9.860000 ms
      java.awt.Component.<clinit>()                                                          1  9.120000 ms
      sun.font.SunFontManager.<clinit>()                                                     1  8.350000 ms
      sun.java2d.SurfaceData.<clinit>()                                                      1  8.300000 ms
      java.security.Security.<clinit>()                                                      1  8.020000 ms
      sun.security.util.KnownOIDs.<clinit>()                                                 1  7.550000 ms
      ...

      If the constructor of a class needs to be traced or timed, <init> is used as the method name. Besides a method, the filter can be a class, in which case all the methods in that class are timed or traced. The filter can also be an annotation, which can be applied to either a method or a class. If applied to a class, all its methods will be traced or timed. For example, to see the number of times a REST endpoint is invoked and measure the approximate execution time:

      $ jcmd <pid> JFR.start method-timing=@jakarta.ws.rs.GET

      Multiple filters can be specified, with semicolons as a delimiter. In the following example, the creation and destruction of file descriptors are traced. By looking at where the events do not match up, it is possible to find where file descriptors are leaking.

      $ java '-XX:StartFlightRecording:filename=fd.jfr,method-trace=java.io.FileDescriptor::<init>;java.io.FileDescriptor::close' ...
      $ jfr view --cell-height 10 MethodTrace fd.jfr
      
                                                        Method Trace
      
      Start Time    Duration Event Thread     Stack Trace                                     Method                         
      ---------- ----------- ---------------- ----------------------------------------------- -------------------------------                             
      10:50:11   0.000167 ms AWT-EventQueue-0 java.io.FileInputStream.<init>(File)            java.io.FileDescriptor.<init>()
                                              sun.security.provider.FileInputStreamPool.ge...                                
                                              sun.security.provider.NativePRNG$RandomIO.<i...                                
                                              sun.security.provider.NativePRNG.initIO(...)                                   
                                              sun.security.provider.NativePRNG.<clinit>()                                    
                                              sun.security.provider.SunEntries.<clinit>()                                    
                                              sun.security.provider.Sun.<init>()                                             
                                              sun.security.jca.ProviderConfig.getProvider()                                  
                                              sun.security.jca.ProviderList.getProvider(int)                                 
                                              sun.security.jca.ProviderList$2.get(int)                                                                                                                                
      10:50:20    0.00113 ms AWT-EventQueue-0 java.io.FileInputStream.<init>(File)            java.io.FileDescriptor.<init>()
                                              java.io.FileInputStream.<init>(String)                                         
                                              java.io.FileReader.<init>(String)                                              
                                              java2d.demos.Mix.BezierScroller.getFile()                                      
                                              java2d.demos.Mix.BezierScroller.step(int, int)                                 
                                              java2d.Surface.paint(Graphics)                                                 
                                              javax.swing.JComponent.paintChildren(Graphics)                                 
                                              javax.swing.JComponent.paint(Graphics)                                         
                                              javax.swing.JComponent.paintChildren(Graphics)                                 
                                              javax.swing.JComponent.paint(Graphics)                         
      ...

      The new grammar for the filter setting, used by the method-trace and method-timing options, is:

      filter ::= target (";" target)*
      target ::= class | class-method | method | annotation
      class ::= identifier ("." identifier)*
      class-method ::= class method
      method ::= "::" method-name
      method-name ::= identifier | "<clinit>" | "<init>"
      annotation ::= "@" class
      identifier ::= see JLS 3.8

      Users who do not want to use the JFR API to add events to their code, perhaps because they think it is noisy or because they want to avoid a dependency on the jdk.jfr module, can create custom annotations, for example:

      @Retention(RUNTIME)
      @Target({ TYPE, METHOD })
      public @interface StopWatch {
      }
      
      @Retention(RUNTIME)
      @Target({ TYPE, METHOD })
      public @interface Initialization {
      }
      
      @Retention(RUNTIME)
      @Target({ TYPE, METHOD })
      public @interface Debug {
      }

      These annotations can be added to classes or methods that are worth investigating. When there is a need to troubleshoot the application, the appropriate annotation can be specified, and all the annotated methods will be timed or traced, for example:

      $ java -XX:StartFlightRecording:method-trace=@com.example.Debug. ...
      $ java -XX:StartFlightRecording:method-timing=@com.example.Initialization,@com.example.StopWatch ...

      The resulting MethodTrace and MethodTiming events can be correlated with other events that JFR emits, such as events for lock contention, method sampling, or I/O, to understand why the annotated parts of the program were slow.

      For configuration over the network, JMX and the RemoteRecordingStream class in the jdk.management.jfr module can be used. The following example demonstrates a non-secure connection to a remote host, where timing and tracing are enabled for 60 seconds. During this period, MethodTrace and MethodTiming events can be subscribed to using the onEvent method.

      String url = “service:jmx:rmi:///jndi/rmi://example.com:7091/jmxrmi”;
      JMXServiceURL jmxURL = new JMXServiceURL(url);
      JMXConnector connector = JMXConnectorFactory.connect(jmxURL);
      try (var r = new RemoteRecordingStream(connector.getMBeanServerConnection())) {
          Map<String, String> settings = new HashMap<>();
          // Trace methods in class com.foo.Bar that takes more than 1 ms to execute
          settings.put("jdk.MethodTrace#enabled", "true");
          settings.put("jdk.MethodTrace#stackTrace", "true");
          settings.put("jdk.MethodTrace#threshold", "1 ms");
          settings.put("jdk.MethodTrace#filter", "com.foo.Bar");
          r.onEvent("jdk.MethodTrace", event -> ...);
          // Measure execution time and invocation count for methods with 
          // the jakarta.ws.rs.GET annotation and emit the result every 10 s.
          settings.put("jdk.MethodTiming#enabled", "true");
          settings.put("jdk.MethodTiming#filter", "@jakarta.ws.rs.GET");
          settings.put("jdk.MethodTiming#period", "10 s");
          r.onEvent("jdk.MethodTiming", event -> ...);    
          r.setSettings(settings);
          r.startAsync();
          Thread.sleep(60_000);
          r.stop();
      }
      connector.close();

      After the 60-second interval, the RemoteRecordingStream is stopped, and the injected bytecode for the MethodTrace and MethodTiming events is removed. The RemoteRecordingStream then waits for any pending events to be consumed. Once complete, the stream is closed, and the recording data on the remote host is discarded, restoring the system to its original state. For in-process use, the RecordingStream class can be used to consume MethodTiming and MethodTracing events in a similar manner, for example in a Java agent.

      In case multiple recordings are running at the same time, either started remotely or locally, the union of all the filters will be applied.

      In scenarios where more than a few methods should be traced, a separate .jfc file can be created:

      <?xml version="1.0" encoding="UTF-8"?>
      <configuration version="2.0">
        <event name="jdk.MethodTiming">
          <setting name="enabled">true</setting>
          <setting name="filter">
            com.example.Foo::method1;
            com.example.Bar::method2;
            ...
            com.example.Baz::method17
          </setting>
        </event>
      </configuration>

      It can then be used in conjunction with the default settings:

      $ java -XX:StartFlightRecording:settings=timing.jfc,settings=default ... 

      Alternatives

      • We considered recording method parameters and instance variables but rejected this approach because it turns JFR into an attack vector for extracting sensitive information, either programmatically in an unvetted third-party library or through a maliciously crafted .jfc file.

      • Sometimes, developers cannot determine the specific class to instrument, such as an implementation of java.sql.Connection or java.lang.System.Logger. In these cases, specifying the interface provides a convenient way to instrument all implementing classes. This functionality could become part of a future enhancement without altering the existing design.

      • We designed the filter grammar not to accept wildcards to prevent an excessive number of instrumented methods, which could bring the application to a halt. To maintain a simple filter syntax, we chose not to support method overloads that require comma-separated parameters, as they conflict with those used for option separation.

      • The feature is not intended to be used on a large number of methods due to the high overhead it would impose. To prevent misuse, we considered limiting the number of methods that can be instrumented but decided against it. There are scenarios where it may be acceptable, such as timing every class initializer, which only occurs once in an application's lifecycle.

      Risks and Assumptions

      Specifying a filter that includes methods used by the added instrumentation could lead to infinite recursion. This can be mitigated by not allowing tracing of classes in the jdk.jfr module and some packages in java.base.

            mr Mark Reinhold
            egahlin Erik Gahlin
            Erik Gahlin Erik Gahlin
            Markus Grönlund, Vladimir Kozlov
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated: