JDK-8257602

Introduce JFR Event Throttling and new jdk.ObjectAllocationSample event (enabled by default)



    • jfr
    • b28


      This enhancement tracks work resulting from a collaboration between DataDog and Oracle, building on the POC originally suggested by [~jbachorik] @DataDog:

      JFR currently has two memory allocation profiling events: jdk.ObjectAllocationInNewTLAB for allocations inside thread-local allocation buffers (TLABs) and jdk.ObjectAllocationOutsideTLAB for allocations outside them. These events are quite useful for both allocation profiling and TLAB tuning.
      They are, however, quite hard to reason about in terms of data production rates, and in certain edge cases, both the overhead and the data volume can be quite high. Because of this, the events are disabled in the default JFR configuration (default.jfc) and only enabled in the profiling JFR configuration (profile.jfc).

      Since object allocation is one of the most important aspects of understanding Java application performance, especially for always-on production-time profiling, which is arguably the most important domain for JFR, these are quite serious drawbacks.
      It would be very convenient if information about object allocations could be turned on by default, out of the box, because it would provide information about allocation patterns for any Java application.

      There are two reasons why it has not yet been possible to have the existing allocation events enabled by default:

      Huge, non-deterministic number of events
      The number of events recorded is a function of the allocation pressure in an application and, in general, Java applications allocate a lot of objects. Although the JFR engine has no trouble keeping up, these two event types, when enabled, account for roughly 75-80% of all events recorded. Compared to not having the events turned on, recording files on disk can easily grow 5-6x in size. These large files quickly become unwieldy, especially when they need to be moved elsewhere.

      Performance overhead (small)
      Although the event sites hook into the slow paths of object allocation in the JVM, in a regular Java application the slow paths are also very heavily trafficked. This is a very critical path, and it is important that overhead is reduced to an absolute minimum. Since arguably the most important piece of information when profiling an allocation is a stack trace pinpointing where it originated, capturing stack traces is central to the events. Capturing stack trace information is very fast in JFR, but it is still one of the most performance-sensitive operations. It can quickly introduce unwanted overhead, both from the sheer number of frames to iterate and from contention on hash tables as concurrency increases. Normally this is a non-problem for other JFR event types, but for events that sit in critical paths, it needs to be considered.

      JFR has historically had a problem with unregulated event data sets, because, up to this point, there has not existed an adequate, as in performant, reliable and representative, means to sub-sample, or throttle, the emission rate for instant events. This is one of the main reasons much care goes into deciding the parameters for the default configuration (i.e. default.jfc) - it will have to be universally acceptable, also in anomalous environments and situations.
      Granted, the concept of a threshold exists, reified as the threshold setting in configuration files, and it acts a little like a throttle in that it limits the events recorded to only those above the threshold. Unfortunately, a threshold setting can only be configured for duration events and, perhaps more importantly, it records only events considered to be outliers.

      We would like a general mechanism that can record subsets of any event type, of configurable sizes, one that not only gives outliers but is also statistically representative.

      Introducing JFR Event Throttling

      Throttle event setting:

        <setting name="throttle">100/s</setting>

      A new configurable setting is made available in the .jfc files and in the settings system in general. It takes an expression that instructs JFR how to select a subset from the extension of an event type, that is, from the full set of events emitted for that type. For now, only a single expression form is supported, one that expresses a rate, such as 100/s. The expression states the number of events per time unit; for 100/s, JFR throttles the event emission rate to 100 per second, distributed evenly over time. Intuitively, the expression declares how many events per second we are aiming for.
      Casually, we call this “throttling” and say the event type is “throttled” when it is configured with this setting. Note that the actual rate can be much lower than the target, simply because no events, or only a small number of them, are being generated in the system. More importantly, from the perspective of improved determinism and control, JFR will not produce a rate higher than the one in the expression, no matter the overall event pressure; hence the expression acts as a maximum rate.

      All existing time units will be supported, e.g.: 2/ns, 5/us, 1/ms, 100/s, 600/m, 3600/h, 86400/d. Note that for the initial introduction, only a single time unit is supported as the denominator. In the future, support for time coefficients, for example 600/5m, can be added if needed.
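
      To make the expression form concrete, here is a minimal sketch of how a rate such as 100/s could be split into an event count and a period. ThrottleRate and its members are hypothetical names chosen for illustration; this is not the parser used inside JFR. The sketch assumes the unit suffixes listed above and rejects time coefficients such as 600/5m.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of JFR): parses rate expressions such as "100/s". */
final class ThrottleRate {
    final long events;       // target number of events per period
    final long periodNanos;  // length of one period in nanoseconds

    private ThrottleRate(long events, long periodNanos) {
        this.events = events;
        this.periodNanos = periodNanos;
    }

    static ThrottleRate parse(String expr) {
        String[] parts = expr.trim().split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected <count>/<unit>: " + expr);
        }
        long events = Long.parseLong(parts[0].trim());
        if (events < 0) {
            throw new IllegalArgumentException("negative count: " + expr);
        }
        return new ThrottleRate(events, unitNanos(parts[1].trim()));
    }

    // Only bare units are accepted; "5m" and similar coefficients are rejected.
    private static long unitNanos(String unit) {
        switch (unit) {
            case "ns": return 1L;
            case "us": return TimeUnit.MICROSECONDS.toNanos(1);
            case "ms": return TimeUnit.MILLISECONDS.toNanos(1);
            case "s":  return TimeUnit.SECONDS.toNanos(1);
            case "m":  return TimeUnit.MINUTES.toNanos(1);
            case "h":  return TimeUnit.HOURS.toNanos(1);
            case "d":  return TimeUnit.DAYS.toNanos(1);
            default:
                throw new IllegalArgumentException("unknown time unit: " + unit);
        }
    }

    /** Maximum events per second implied by the expression. */
    double perSecond() {
        return events * 1e9 / periodNanos;
    }
}
```

      With this sketch, ThrottleRate.parse("600/m").perSecond() evaluates to 10 events per second, while ThrottleRate.parse("600/5m") throws IllegalArgumentException.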

      As part of this enhancement, only a single event, the new jdk.ObjectAllocationSample event, will support the new throttle setting; applying it to other event types does nothing.
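
      As an illustration of how the setting might be applied programmatically, the sketch below starts a recording with jdk.ObjectAllocationSample enabled and throttled. It relies on the public jdk.jfr.Recording API and on the "<event>#<setting>" key form used in JFR settings maps; the class name ThrottledAllocationRecording, the 100/s rate and the output file name allocations.jfr are assumptions made for the example.

```java
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import jdk.jfr.Recording;

/** Hypothetical example class; only jdk.jfr.Recording is a real JDK API here. */
final class ThrottledAllocationRecording {
    static final String EVENT = "jdk.ObjectAllocationSample";

    /** Builds a settings map that enables the event and sets its throttle rate. */
    static Map<String, String> settings(String rate) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put(EVENT + "#enabled", "true");
        settings.put(EVENT + "#throttle", rate);
        return settings;
    }

    public static void main(String[] args) throws Exception {
        try (Recording recording = new Recording(settings("100/s"))) {
            recording.start();
            // ... run the workload to be profiled here ...
            recording.stop();
            // Assumed output location, chosen for the example.
            recording.dump(Path.of("allocations.jfr"));
        }
    }
}
```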

      JFR Adaptive Sampler:

      The implementation of the sampler in the JVM is key to enabling this functionality. The adaptive sampler is highly performant and general enough that additional specializations can be built on it moving forward. We add one specialized concept, the JFR Event Throttler, which is the component that evaluates sample set inclusion for event types configured with the throttle setting, with the JFR Adaptive Sampler providing the indicator function.

      Recording only a throttled / sub-sampled set of events is very useful, especially for events located in critical paths (allocations, locks and more), because it becomes possible to control both size and overhead. The size of a subset can then be configured based on requirements, and also modified dynamically while the system is running. With this mechanism, it is possible to introduce a new allocation event that can be enabled by default, precisely because it is throttled.
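
      The idea of an indicator function that adapts to event pressure can be illustrated with a deliberately simplified sketch. It is not the JVM implementation: WindowedSampler, its fields and the explicit rotateWindow() call are hypothetical, and a real sampler would rotate windows based on elapsed time. The sketch selects every n-th event, derives n from a smoothed estimate of how many events earlier windows saw, and never exceeds the per-window target.

```java
/** Simplified illustration of windowed, adaptive sub-sampling (hypothetical, not JFR code). */
final class WindowedSampler {
    private final long samplesPerWindow; // target (and maximum) samples per window
    private final double alpha;          // smoothing factor for the population estimate
    private double avgPopulation;        // smoothed number of events seen per window
    private long interval = 1;           // select every interval-th event
    private long seen;                   // events seen in the current window
    private long taken;                  // samples selected in the current window

    WindowedSampler(long samplesPerWindow, double alpha) {
        if (samplesPerWindow <= 0 || alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("invalid sampler parameters");
        }
        this.samplesPerWindow = samplesPerWindow;
        this.alpha = alpha;
    }

    /** Indicator function: true if the current event belongs to the sample set. */
    boolean sample() {
        seen++;
        if (taken < samplesPerWindow && seen % interval == 0) {
            taken++;
            return true;
        }
        return false;
    }

    /** Closes the current window and adapts the interval for the next one. */
    void rotateWindow() {
        avgPopulation = alpha * seen + (1.0 - alpha) * avgPopulation;
        interval = Math.max(1L, Math.round(avgPopulation / samplesPerWindow));
        seen = 0;
        taken = 0;
    }

    /** Demo helper: feeds a number of events and returns how many were selected. */
    long offer(long events) {
        long selected = 0;
        for (long i = 0; i < events; i++) {
            if (sample()) {
                selected++;
            }
        }
        return selected;
    }
}
```

      With a target of 10 samples per window and alpha set to 1.0, a first window of 1000 events yields the first 10 events because no history exists yet. After rotateWindow(), a second window of 1000 events yields every 100th event, so the 10 samples are spread evenly across the window.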

      The jdk.ObjectAllocationSample event definition:

        <Event name="ObjectAllocationSample" category="Java Application" label="Allocation sample" description="Allocation sample" thread="true" stackTrace="true" startTime="false" throttle="true">
          <Field type="Class" name="objectClass" label="Object Class" description="Class of allocated object" />
          <Field type="ulong" contentType="bytes" name="weight" label="Sample Weight" description="An attribute to facilitate the relative comparison of samples, not necessarily the memory amount allocated by the sampled object" />
        </Event>
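
      Because weight is meant for relative comparison of samples, one plausible way to consume the event is to sum the weights per allocated class. The sketch below does this with the jdk.jfr.consumer API, using the field names objectClass and weight from the definition above. AllocationWeights and its methods are hypothetical names, and the result should be read as relative allocation pressure, not as exact byte counts.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import jdk.jfr.consumer.RecordedClass;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

/** Hypothetical helper: aggregates sample weights per allocated class. */
final class AllocationWeights {
    /** Sums the weight of all jdk.ObjectAllocationSample events in a recording file. */
    static Map<String, Long> byClass(Path recording) throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        for (RecordedEvent event : RecordingFile.readAllEvents(recording)) {
            if (!"jdk.ObjectAllocationSample".equals(event.getEventType().getName())) {
                continue;
            }
            RecordedClass type = event.getClass("objectClass");
            String name = (type == null) ? "(unknown)" : type.getName();
            add(totals, name, event.getLong("weight"));
        }
        return totals;
    }

    /** Adds one sample's weight to the running total for its class. */
    static void add(Map<String, Long> totals, String className, long weight) {
        totals.merge(className, weight, Long::sum);
    }
}
```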

      Disabling jdk.ObjectAllocationInNewTLAB and jdk.ObjectAllocationOutsideTLAB in the profile.jfc:

      With the new allocation event enabled in both default.jfc and profile.jfc, we also take the opportunity to disable jdk.ObjectAllocationInNewTLAB and jdk.ObjectAllocationOutsideTLAB in profile.jfc, since their inclusion has led to reports of very large recording files that make the profile configuration cumbersome to work with. The events are still available, but they will not be turned on by the configuration files shipped in the JDK.





              mgronlun Markus Grönlund