Summary
Improve warm-up time by making profile data from a previous run of an application available as soon as the HotSpot Java Virtual Machine starts. Specifically, enhance the AOT cache to store method execution profiles from training runs, reducing profiling delays in subsequent production runs.
Goals
Help applications warm up more quickly by supplying precomputed method profiles, so that the JIT can begin to produce code immediately when a production run starts.
Do not introduce new constraints on application execution.
Do not introduce new AOT workflows; instead, extend the existing AOT cache creation command from JEP 483.
Non-Goals
- It is not a goal of this JEP to make any other improvement to JEP 483 besides adding profile information to the AOT cache.
Success Metrics
Measurable warm-up time improvements, due to shifting execution profiling work from production runs to training runs.
Profile data in the AOT cache is observably used by the JIT, even in a production run which has new profiling activity disabled (for testing).
Motivation
A special strength of the Java Virtual Machine is its ability to respond flexibly and dynamically to unpredictable application behaviors. These behaviors are often statically unpredictable, and may even include dynamic loading of code not present during any static analysis.
But the Virtual Machine optimizes it all, because it can use the evidence of actual prior execution, rather than rely solely on a weakly predictive static model. This evidence is collected by a process called profiling.
A profile for some method is collected when the JVM counts invocations of it (and perhaps executions of its bytecodes), and makes a note of the dynamic types of operands (such as method receivers) to bytecodes within the method. This information is then used to predict the method’s future behavior, and to optimize compiled code for that expected behavior.
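As a rough illustration of the kind of data described above, the following toy model counts invocations and records the receiver types seen at a single call site. It is a sketch only: the class `ToyMethodProfile` and its methods are hypothetical names invented for this example and do not reflect HotSpot's internal profile data structures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of a method profile; NOT HotSpot's internal representation.
final class ToyMethodProfile {
    private long invocations;
    // Receiver class name -> number of times it was observed at one call site.
    private final Map<String, Long> receiverCounts = new HashMap<>();

    // Record one invocation together with the dynamic type of its receiver.
    void recordInvocation(Object receiver) {
        invocations++;
        receiverCounts.merge(receiver.getClass().getName(), 1L, Long::sum);
    }

    long invocations() {
        return invocations;
    }

    // The most frequently observed receiver type, or null if nothing was recorded.
    String dominantReceiver() {
        String best = null;
        long bestCount = 0;
        for (Map.Entry<String, Long> e : receiverCounts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        ToyMethodProfile profile = new ToyMethodProfile();
        for (int i = 0; i < 90; i++) profile.recordInvocation("text");
        for (int i = 0; i < 10; i++) profile.recordInvocation(Integer.valueOf(i));
        // Prints: 100 calls, mostly java.lang.String
        System.out.println(profile.invocations() + " calls, mostly " + profile.dominantReceiver());
    }
}
```

A compiler that sees such a profile could, for example, speculate that the receiver is almost always a `String` and optimize the call site for that case.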
By using profiles, the Virtual Machine supports flexible application behaviors, with full optimization. This benefits many Java workloads.
Because profiling is done in the application itself, it is paid for by time spent gathering and using the dynamic profile information, as an application starts up and warms up. For an application that is run repeatedly, it would be preferable to gather such dynamic profile information once, in a “training run”, cache it, and then use it to accelerate many “production runs”.
AOT caches, as defined in JEP 483, provide the means for this caching, if only they can be enhanced to record the profiling information gathered in a training run. Today’s AOT caches already store a variety of data, such as loaded classes, linkage information, and even Java objects. Adding profile data into this mix is natural and desirable.
As a general point, more and more information about the past behavior of an application gives the JVM a better and better estimate of the future behavior. This in turn enables the JVM to focus more accurately on the compilation work that will improve application performance, and to avoid useless optimization of methods that are not likely to be called frequently. Thus, better profile data, including historical profile data in an AOT cache, will allow the JVM to quickly JIT-compile important code, and to defer the compilation of less important code.
Description
We extend the AOT cache (of JEP 483) to record execution profiles. These profiles are an old technology, generated by the HotSpot Virtual Machine for use by the JIT. What is new is that profile metadata will be stored in the AOT cache, much like the class metadata already recorded in the AOT cache.
AOT caches will normally include this profiling data. It may be suppressed by an additional command-line option, -XX:-AOTMethodProfiling (TBD).
Profiles cached from training runs will not prevent additional profiling during production runs. The application will also collect its own profile data, as usual. The JVM’s compilation policy will be tuned to use all available profile data. The net effect is that the JIT is able to run earlier, if profile information is available from a training run, since the production run does not have to reproduce such information from scratch. And yet the same peak performance can be reached, in the end, regardless of the dynamic behavior of the application during the production run.
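The effect on JIT timing can be sketched with a deliberately simplified model. The class `ToyCompilePolicy`, its threshold, and the rule of simply adding cached and live invocation counts are all assumptions made for illustration; this JEP does not specify how the compilation policy combines the two sources, and HotSpot's actual policy is considerably more involved.

```java
// Hypothetical sketch; names, threshold, and the additive rule are illustrative only.
final class ToyCompilePolicy {
    private final long threshold;

    ToyCompilePolicy(long threshold) {
        this.threshold = threshold;
    }

    // Assumption: counts from the cached profile and the live profile are simply added.
    boolean shouldCompile(long cachedInvocations, long liveInvocations) {
        return cachedInvocations + liveInvocations >= threshold;
    }

    // Live invocations still needed in the production run before compilation is triggered.
    long liveInvocationsNeeded(long cachedInvocations) {
        return Math.max(0, threshold - cachedInvocations);
    }

    public static void main(String[] args) {
        ToyCompilePolicy policy = new ToyCompilePolicy(10_000);
        System.out.println(policy.liveInvocationsNeeded(0));      // 10000: no cached profile
        System.out.println(policy.liveInvocationsNeeded(25_000)); // 0: method was hot in training
        System.out.println(policy.shouldCompile(4_000, 6_000));   // true: combined evidence suffices
    }
}
```

In this model, a method that was hot in the training run is eligible for compilation as soon as the production run starts, while live profiling continues to contribute evidence for methods the training run did not exercise.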
Alternatives
Doing nothing continues the current profiling delays before the JIT can get to its useful work.
If an application is so predictable that AOT code can be generated for it, and that will allow it to reach peak performance without further JIT activity, then such code is preferable to caching profiles. Such AOT code will be provided by another JEP. However, many Java applications benefit from a mix of AOT compilation and JIT compilation, since their behavior cannot be accurately predicted by an AOT compiler.
Cached profiles and cached AOT code are thus not mutually antagonistic, and will synergize to provide best performance for a wide range of applications. A partial AOT solution, where reasonable AOT code is replaced by a delayed JIT, seems likely to be the best solution in the end. The delayed JIT can stay out of the application’s way, and take its time to get the final code just right, based on the latest profiling information.
Testing
We will create new unit test cases that cover specific behavior of this JEP.
We can run existing AOT cache test cases with this option enabled. Such test cases should still pass.
Risks and Assumptions
There are no new risks, beyond those already inherent to the AOT technology as noted in JEP 483.
The base assumption of the AOT cache is operative: A training run is assumed to be a good source of observable decisions, such that, when they are passed through an AOT cache to a production run, they will benefit the performance of that production run.
Dependencies
This JEP is an evolution of the existing AOT cache implementation. It depends on JEP 483. Future work in Project Leyden is likely to depend on it.
Relates to:
- JDK-8353598 Allow AOT cache to be used in training run (Open)
- JDK-8335358 [premain] Explore alternative ways to trigger the end of training run (Resolved)