Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8328610

JEP 520: JFR Method Timing & Tracing

XMLWordPrintable

    • Feature
    • Open
    • jfr
    • JDK
    • hotspot dash jfr dash dev at openjdk dot org
    • S
    • S
    • 520

      Summary

      Extend the JDK Flight Recorder (JFR) with facilities for method timing and tracing via bytecode instrumentation.

      Goals

      Non-Goals

      • It is not a goal to record method arguments or the values of non-static fields.

      • It is not a goal to time or trace methods that do not have a bytecode representation, such as abstract, native, or non-static non-default interface methods.

      • It is not a goal to time or trace a large number of methods simultaneously, since that would significantly degrade performance. Use method sampling in such cases.

      • JFR generally aims to impose a CPU overhead of less than one percent. It is not a goal to remain within this constraint when timing and tracing methods.

      Motivation

      Timing and tracing method invocations can help to identify performance bottlenecks, optimize code, and find the root causes of bugs. For example, if an application takes an unusually long time to start, tracing static initializers can reveal class loading that could be deferred to a later stage. If a method was changed to fix a performance bug, timing its execution can confirm that the fix was successful. If an application fails because it runs out of database connections, tracing the method that opens connections can suggest how to manage those connections more effectively.

      During development, we already have good tools for analyzing method execution: The Java Microbenchmark Harness (JMH) can measure method execution times, and debuggers built on top of the Java Debug Interface (JDI) can set breakpoints and inspect call stacks.

      During testing and production, however, timing and tracing pose challenges. There are several approaches, none fully satisfactory:

      • Add temporary log statements or JFR events to the methods under investigation. This is cumbersome at best, and not feasible for third-party libraries or classes in the JDK.

      • Use sample-based profilers, such as the one built into JFR or third-party tools such as async-profiler. These tools capture stack traces for frequently executed methods, but they fall short of timing and tracing every invocation.

      • Use a Java agent, such as the one employed by the JDK Mission Control tool, to instrument methods to emit JFR events. While this approach works, doing such instrumentation from within the JDK would have performance benefits. For example, the JVM can filter methods, eliminating the need to parse the bytecode of every loaded class twice. Having this feature in the JDK would also simplify usage, since methods could be timed and traced without having to configure or install an agent.

      The JDK should provide a way to time and trace method invocations with low overhead, enabling developers to easily and efficiently analyze and optimize their applications.

      Description

      We introduce two new JFR events, jdk.MethodTiming and jdk.MethodTrace. They both accept a filter to select the methods to time and trace.

      For example, to see what triggers the resize of a HashMap, you can configure the MethodTrace event's filter when making a recording and then use the <code class="prettyprint" data-shared-secret="1748626205066-0.15088088780569286">jfr</code> tool to display the recorded event:

      $ java -XX:StartFlightRecording:jdk.MethodTrace#filter=java.util.HashMap::resize,filename=recording.jfr ...
      $ jfr print --events jdk.MethodTrace --stack-depth 20 recording.jfr
      jdk.MethodTrace {
          startTime = 00:39:26.379 (2025-03-05)
          duration = 0.00113 ms
          method = java.util.HashMap.resize()
          eventThread = "main" (javaThreadId = 3)
          stackTrace = [
            java.util.HashMap.putVal(int, Object, Object, boolean, boolean) line: 636
            java.util.HashMap.put(Object, Object) line: 619
            sun.awt.AppContext.put(Object, Object) line: 598
            sun.awt.AppContext.<init>(ThreadGroup) line: 240
            sun.awt.SunToolkit.createNewAppContext(ThreadGroup) line: 282
            sun.awt.AppContext.initMainAppContext() line: 260
            sun.awt.AppContext.getAppContext() line: 295
            sun.awt.SunToolkit.getSystemEventQueueImplPP() line: 1024
            sun.awt.SunToolkit.getSystemEventQueueImpl() line: 1019
            java.awt.Toolkit.getEventQueue() line: 1375
            java.awt.EventQueue.invokeLater(Runnable) line: 1257
            javax.swing.SwingUtilities.invokeLater(Runnable) line: 1415
            java2d.J2Ddemo.main(String[]) line: 674
      ] } $

      The filter is specified as java.util.HashMap::resize, just like a method reference in source code. As the JVM starts up, it instruments this method by injecting bytecode to emit a MethodTrace event.

      Use via configuration files

      In practice, JFR is usually configured via a configuration file rather than by explicitly mentioning events on the command line. The JDK includes two configuration files, default.jfc and profile.jfc; the former is used when no configuration file is specified, while the latter contains convenient configuration settings for profiling. Such files can declare options which can be given on the command line when starting the JVM, or via jcmd on an already running JVM.

      We enhance default.jfc and profile.jfc with two new options, method-timing and method-trace, which control the filter settings for the MethodTiming and MethodTrace events. The method-timing option counts the number of invocations and calculates the average execution time of methods that match the filter, while the method-trace option records stack traces for methods that match the filter.

      Additionally, we enhance the jfr view and jcmd <pid> JFR.view commands to display method timing and tracing results.

      To put this all together, if an application suffers from slow startup, timing the execution of all static initializers might suggest where lazy initialization could be used. We can time all static initializers in all classes by omitting the class name, specifying ::<clinit> as the filter:

      $ java '-XX:StartFlightRecording:method-timing=::<clinit>,filename=clinit.jfr' ...
      $ jfr view method-timing clinit.jfr
      
                                       Method Timing
      
      Timed Method                                           Invocations Average Time
      ------------------------------------------------------ ----------- ------------
      sun.font.HBShaper.<clinit>()                                     1 32.500000 ms
      java.awt.GraphicsEnvironment$LocalGE.<clinit>()                  1 32.400000 ms
      java2d.DemoFonts.<clinit>()                                      1 21.200000 ms
      java.nio.file.TempFileHelper.<clinit>()                          1 17.100000 ms
      sun.security.util.SecurityProviderConstants.<clinit>()           1  9.860000 ms
      java.awt.Component.<clinit>()                                    1  9.120000 ms
      sun.font.SunFontManager.<clinit>()                               1  8.350000 ms
      sun.java2d.SurfaceData.<clinit>()                                1  8.300000 ms
      java.security.Security.<clinit>()                                1  8.020000 ms
      sun.security.util.KnownOIDs.<clinit>()                           1  7.550000 ms
      ...
      $ 
      

      Filtering on classes and annotations

      Besides a method, a filter can name a class, in which case all methods in the class are timed or traced.

      A filter can also name an annotation. This causes all methods bearing the annotation, and all methods in all classes bearing the annotation, to be timed or traced. For example, to see the number of times that a Jakarta REST endpoint is invoked, and measure the approximate execution time:

      $ jcmd <pid> JFR.start method-timing=@jakarta.ws.rs.GET

      You can specify multiple filters, separating them with semicolons. For example, we can find where file descriptors are leaking by tracing their creation and destruction and looking for events that do not match up:

      $ java '-XX:StartFlightRecording:filename=fd.jfr,method-trace=java.io.FileDescriptor::<init>;java.io.FileDescriptor::close' ...
      $ jfr view --cell-height 5 MethodTrace fd.jfr
      
                                                        Method Trace
      
      Start Time    Duration Event Thread     Stack Trace                                     Method
      ---------- ----------- ---------------- ----------------------------------------------- -------------------------------
      20:32:39   0.000542 ms AWT-EventQueue-0 java.io.FileInputStream.<init>(File)            java.io.FileDescriptor.<init>()
                                              sun.security.provider.FileInputStreamPool.ge...                                
                                              sun.security.provider.NativePRNG$RandomIO.<i...                                
                                              sun.security.provider.NativePRNG.initIO(...)                                   
                                              sun.security.provider.NativePRNG.<clinit>() 
      20:32:39   0.000334 ms AWT-EventQueue-0 java.io.FileInputStream.<init>(File)            java.io.FileDescriptor.<init>()
                                              java.io.FileInputStream.<init>(String)                                         
                                              sun.awt.FontConfiguration.readFontConfigFile...                                
                                              sun.awt.FontConfiguration.init()                                               
                                              sun.font.CFontManager.createFontConfiguration()                                
      20:32:39     0.0166 ms AWT-EventQueue-0 java.io.FileInputStream$1.close()               java.io.FileDescriptor.close() 
                                              java.io.FileDescriptor.closeAll(Closeable)                                     
                                              java.io.FileInputStream.close()                                                
                                              sun.awt.FontConfiguration.readFontConfigFile...                                
                                              sun.awt.FontConfiguration.init()                                               
      ...
      $ 
      

      The --cell-height option sets the maximum number of rows to display for an event. In this example, it sets the maximum number of frames to show for each stack trace.

      Filter grammar

      Overall, the grammar for filters is:

      filter ::= target (";" target)*
      target ::= class | class-method | method | annotation
      class ::= identifier ("." identifier)*
      class-method ::= class method
      method ::= "::" method-name
      method-name ::= identifier | "<clinit>" | "<init>"
      annotation ::= "@" class
      identifier ::= see JLS §3.8
      

      A filter with the method name <init> matches all of the specified class's constructors, just as <clinit> matches the class's static initializer.

      A filter will match multiple methods when the specified method name is overloaded, in which case JFR instruments every method of that name in the specified class.

      Custom configuration files

      If you need to time or trace more than a few methods, you can create a separate configuration file to name them all:

      <?xml version="1.0" encoding="UTF-8"?>
      <configuration version="2.0">
        <event name="jdk.MethodTiming">
          <setting name="enabled">true</setting>
          <setting name="filter">
            com.example.Foo::method1;
            com.example.Bar::method2;
            ...
            com.example.Baz::method17
          </setting>
        </event>
      </configuration>

      You can then use this in conjunction with the default configuration:

      $ java -XX:StartFlightRecording:settings=timing.jfc,settings=default ...

      Custom annotations

      It can be convenient to define custom annotations with which to label methods or classes for future investigation. For example:

      package com.example;
      
      @Retention(RUNTIME)
      @Target({ TYPE, METHOD })
      public @interface StopWatch {
      }
      
      @Retention(RUNTIME)
      @Target({ TYPE, METHOD })
      public @interface Initialization {
      }
      
      @Retention(RUNTIME)
      @Target({ TYPE, METHOD })
      public @interface Debug {
      }

      When you need to troubleshoot the application, you can specify the appropriate annotations for timing or tracing:

      $ java -XX:StartFlightRecording:method-trace=@com.example.Debug ...
      $ java -XX:StartFlightRecording:method-timing=@com.example.Initialization,@com.example.StopWatch ...
      

      You can correlate the resulting MethodTiming and MethodTrace events with other events that JFR emits, such as events for lock contention, method sampling, or I/O, to determine why the annotated parts of the program were slow.

      Remote configuration

      You can use JMX and the JFR [<code class="prettyprint" >RemoteRecordingStream</code>] class to configure timing and tracing over the network. For example:

      import javax.management.remote.*;
      import jdk.management.jfr.*;
      
      /// Establish non-secure connection to remote host
      var url = "service:jmx:rmi:///jndi/rmi://example.com:7091/jmxrmi";
      var jmxURL = new JMXServiceURL(url);
      try (var conn = JMXConnectorFactory.connect(jmxURL)) {
          try (var r = new RemoteRecordingStream(conn.getMBeanServerConnection())) {
      
              // Create map for settings
              var settings = new HashMap<String,String>();
      
              // Trace methods in class com.foo.Bar that take more than 1 ms to execute
              settings.put("jdk.MethodTrace#enabled", "true");
              settings.put("jdk.MethodTrace#stackTrace", "true");
              settings.put("jdk.MethodTrace#threshold", "1 ms");
              settings.put("jdk.MethodTrace#filter", "com.foo.Bar");
      
              // Subscribe to trace events
              r.onEvent("jdk.MethodTrace", event -> ...);
      
              // Measure execution time and invocation count for methods with
              // the jakarta.ws.rs.GET annotation, and emit the result every 10 s
              settings.put("jdk.MethodTiming#enabled", "true");
              settings.put("jdk.MethodTiming#filter", "@jakarta.ws.rs.GET");
              settings.put("jdk.MethodTiming#period", "10 s");
      
              // Subscribe to timing events
              r.onEvent("jdk.MethodTiming", event -> ...);
      
              // Set the settings, then time and trace for 60 seconds
              r.setSettings(settings);
              r.startAsync();
              Thread.sleep(60_000);
              r.stop();
      
          }
      }

      When the stream is started, the remote JVM instruments the specified methods by injecting bytecode to emit MethodTiming and MethodTrace events, as appropriate. When the stream is stopped, after 60 seconds, the injected bytecode is removed. Then, after waiting for any pending events to be consumed, the stream is closed and the remote JVM discards its recording data, restoring the system to its original state.

      You can also use the RemoteRecordingStream class to consume MethodTiming and MethodTrace events within a process in a similar manner, for example in a Java agent.

      If multiple recordings run at the same time, whether started remotely or locally, JFR applies the union of all filters.

      Event details

      A jdk.MethodTiming event has these fields:

      • method — the name of the method that was timed,
      • startTime — the wall-clock time when the event was emitted,
      • invocations — the number of times the method was invoked,
      • average — the average wall-clock time it took for the method to execute.

      A jdk.MethodTrace event has these fields:

      • method — the name of the method that was traced,
      • startTime — the wall-clock time when the method was entered,
      • duration — the wall-clock time it took to execute the method,
      • eventThread — the thread that the method executed in, and
      • stackTrace — a stack trace, starting from the traced method’s caller.

      The precision of the average and duration fields depends upon the time it takes to obtain a timestamp, which in turn depends upon the hardware and operating system.

      The duration recorded for a method includes the time spent executing any methods that it invokes.

      Future Work

      Sometimes, a class to be timed or traced implements a known interface, but the name of the class itself is unknown. In such cases it would be convenient to be able to specify the interface in a filter, causing every class implementing that interface to be timed or traced. We could add such functionality in the future, without altering this design.

      Alternatives

      • JFR could record method arguments and non-static fields when timing or tracing. That would, however, enable it to be used as an attack vector for exfiltrating sensitive information, either programmatically by an unvetted third-party library or through a maliciously crafted configuration file.

      • The filter grammar could allow specific method-name overloads to be specified. That would, however, complicate the syntax since comma-separated parameter names would conflict with the commas that separate JFR options.

      • The filter grammar could accept wildcards, but that could lead to excessive numbers of instrumented methods, bringing the application to a halt.

      • To prevent excessive numbers of instrumented methods, JFR could limit the number of methods that can be instrumented. There are, however, scenarios where that can be acceptable. For example, static initializers are only invoked once, so asking to time or trace all of them is not unreasonable.

      Risks and Assumptions

      Specifying a filter that includes JDK methods used by the injected instrumentation bytecode can lead to infinite recursion. JFR attempts to avoid this, but its mechanism for doing so is fragile. If you observe such recursion then please submit a bug report; in the mean time, you can avoid the recursion by removing JDK methods from your filter.

            egahlin Erik Gahlin
            egahlin Erik Gahlin
            Erik Gahlin Erik Gahlin
            Markus Grönlund, Vladimir Kozlov
            Vladimir Kozlov
            Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated: