Improve the implementation of method handles by replacing assembly language paths with an optimizable intermediate representation and then refactoring the implementation so that more work is done in portable Java code than is hardwired into the JVM.
- Improve performance, quality, and portability of method handles and invokedynamic.
- Reduce the amount of assembly code in the JVM.
- Reduce the frequency of native calls and other complex transitions of control during method handle processing.
- Increase the leverage on JSR 292 performance of existing JVM optimization frameworks.
- Remove low-leverage or complex structures from the JVM that serve JSR 292 only. (E.g., remove the pattern-matching "method handle walk" phase.)
- Complete compatibility with the Java SE 7 specification for JSR 292.
- A better reference implementation of JSR 292.
This work is intended to be a foundation for future optimization work, to be carried out separately in the JVM and in the Java code. Therefore, this project will succeed with modest improvements in performance and stability. Large performance gains are not required, as long as the system is clearly simpler, better factored, and easier to optimize. This in turn will be clear if, after simplifying the Java 7 code base, performance is not harmed.
The JDK 7 implementation of JSR 292 (method handles and invokedynamic)
relies on large amounts of hand-written assembly code to perform method
handle argument transforms. Optimized native code is obtained by a
separate module which performs a pattern-matching walk on method handle
graphs and converts them (inside the JVM JIT) into intermediate
representation (Java bytecodes, and then C1 or C2 IR).
This architecture is adequate for many uses, but suffers from two flaws.
First, invocations of non-constant method handles cannot be optimized,
because the pattern-matching conversion to IR happens only at call
sites (such as invokedynamic). Such invocations must copy the argument
list from compiled to interpreted format, and then execute the
hand-written assembly code for argument transformation, causing data
motion that is excessive and unoptimized. Customers experience this as
a "performance cliff".
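The distinction between constant and non-constant method handles can be illustrated with a minimal sketch. The class and method names below are hypothetical; only the `java.lang.invoke` calls are real API. In `viaConstant` the handle lives in a `static final` field, so a JIT can treat the target as fixed. In `viaNonConstant` the handle arrives as an ordinary argument, which is the case described above as hard to optimize in JDK 7.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class HandleCallSites {
    // Constant handle: stored in a static final field, so the target never changes.
    private static final MethodHandle CONSTANT_LENGTH = lookupLength();

    // Looks up String.length() as a method handle of type (String)int.
    static MethodHandle lookupLength() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invocation through a constant method handle.
    static int viaConstant(String s) {
        try {
            return (int) CONSTANT_LENGTH.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Invocation through a non-constant method handle: the target is only
    // known at run time, because it is passed in by the caller.
    static int viaNonConstant(MethodHandle mh, String s) {
        try {
            return (int) mh.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Both methods return the same value for the same input; the difference lies only in how much the JVM can know about the target at the call site.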
Second, because the interpreted and compiled versions of method handles
use different execution engines (assembly code vs. IR generated from
pattern matches), and because the translation between representations is
imperfect, compiled method handles do not always behave the same as
interpreted ones. In particular, an intermittent
NoClassDefFoundError occurs when translated method handles grow too
large and their bytecodes contain too many symbolic references.
These flaws can be eliminated by removing the assembly code and replacing it with an intermediate representation used for both interpretation and compilation.
As a secondary effect, removing the assembly code will make it easier to port the JVM to additional platforms.
Non-constant method handle invocation will become more frequent in the future, since method handles are part of the infrastructure for Java "lambdas" which are coming in Java SE 8. In general, as JSR 292 is adopted more widely, it must become more robustly performant, across a wider range of use cases.
Make a new intermediate representation, called lambda form, for method
handles that is (a) directly executable and (b) directly and simply
reducible to bytecodes and/or JIT IR. Implement all method handle
and invokedynamic call sites using lambda forms.
Remove all assembly code from methodHandles_<arch>.cpp, except a small
number of assembly instructions (about 100) for sub-primitives used by
the lambda forms. (By contrast, JDK 7 generates method handle stubs,
for numerous user-visible argument transforms, containing about 7000
instructions.)
Move implementation logic specific to method handles from the JVM up into Java code, when possible. Rely on JITs to perform vigorous optimization of lambda forms (or their corresponding bytecodes) and their sub-primitives.
A lambda form represents a series of weakly typed formal parameters followed by a linear, non-branching series of method call expressions. Each expression consists of a method (specified as an arbitrary constant method handle) with associated arguments. An argument can be an arbitrary constant or a reference to a preceding parameter or expression within the lambda form.
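As a rough illustration of this structure, the following toy model stores a parameter count and a linear list of call expressions, and interprets them in order. It is a sketch under stated assumptions, not the JDK's internal representation: `ToyLambdaForm`, `NameRef`, `Call`, and `sample` are hypothetical names, and arguments are boxed for simplicity.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ToyLambdaForm {
    /** Argument that refers to a preceding parameter or expression by index. */
    static final class NameRef {
        final int index;
        NameRef(int index) { this.index = index; }
    }

    /** One call expression: a constant method handle plus its arguments. */
    static final class Call {
        final MethodHandle function;
        final Object[] arguments; // each entry is a constant or a NameRef
        Call(MethodHandle function, Object... arguments) {
            this.function = function;
            this.arguments = arguments;
        }
    }

    private final int arity;       // number of formal parameters
    private final List<Call> body; // linear, non-branching sequence of calls

    ToyLambdaForm(int arity, List<Call> body) {
        if (body.isEmpty()) throw new IllegalArgumentException("empty body");
        this.arity = arity;
        this.body = body;
    }

    /** Evaluates each expression in order; the last value is the result. */
    Object interpret(Object... params) {
        if (params.length != arity) throw new IllegalArgumentException("wrong arity");
        List<Object> names = new ArrayList<>(Arrays.asList(params));
        for (Call call : body) {
            Object[] actual = new Object[call.arguments.length];
            for (int i = 0; i < actual.length; i++) {
                Object a = call.arguments[i];
                // A NameRef may only point backwards; a forward index fails here.
                actual[i] = (a instanceof NameRef) ? names.get(((NameRef) a).index) : a;
            }
            try {
                names.add(call.function.invokeWithArguments(actual));
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
        }
        return names.get(names.size() - 1);
    }

    /** Builds (a0, a1) -> { t2 = addExact(a0, a1); t3 = multiplyExact(t2, 10); t3 }. */
    static ToyLambdaForm sample() {
        try {
            MethodType binaryInt = MethodType.methodType(int.class, int.class, int.class);
            MethodHandle add = MethodHandles.lookup().findStatic(Math.class, "addExact", binaryInt);
            MethodHandle mul = MethodHandles.lookup().findStatic(Math.class, "multiplyExact", binaryInt);
            return new ToyLambdaForm(2, Arrays.asList(
                    new Call(add, new NameRef(0), new NameRef(1)),
                    new Call(mul, new NameRef(2), 10)));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this model, `ToyLambdaForm.sample().interpret(3, 4)` evaluates `(3 + 4) * 10`. Because the body is a flat list with backward references only, translating it to straight-line bytecode would be a simple loop over the same list.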
Sub-primitives consist of low-level adapters for raw method handle invocation and for emulating each of the four invocation modes (invokevirtual, etc.).
A lambda form is compilable at any time into compact Java bytecodes and passed to the JVM for dynamic loading. Lambda forms will contain their own invocation counters which will allow them to delay compilation until they are "hot" enough. Future versions of the JVM may be able to execute and/or compile and/or profile lambda forms directly, leading to further variations of the theme of mixed-mode execution and optimization.
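The counter-based delay can be sketched as follows. This is a simplified, single-threaded illustration: `CountingForm`, the threshold parameter, and the use of `IntUnaryOperator` to stand in for the interpreted and compiled paths are all assumptions made for the example, not details taken from the implementation.

```java
import java.util.function.IntUnaryOperator;
import java.util.function.Supplier;

public class CountingForm {
    private final IntUnaryOperator interpreted;         // slow path used while cold
    private final Supplier<IntUnaryOperator> compiler;  // produces the fast path on demand
    private final int threshold;                        // hypothetical "hotness" limit
    private int invocations;
    private IntUnaryOperator compiled;                  // null until the form is hot

    CountingForm(IntUnaryOperator interpreted,
                 Supplier<IntUnaryOperator> compiler,
                 int threshold) {
        this.interpreted = interpreted;
        this.compiler = compiler;
        this.threshold = threshold;
    }

    /** Runs the interpreted path until the counter reaches the threshold. */
    int invoke(int x) {
        if (compiled != null) {
            return compiled.applyAsInt(x);
        }
        if (++invocations >= threshold) {
            compiled = compiler.get(); // compile once, then reuse
            return compiled.applyAsInt(x);
        }
        return interpreted.applyAsInt(x);
    }

    boolean isCompiled() {
        return compiled != null;
    }
}
```

With a threshold of 3, the first two calls take the interpreted path and the third triggers compilation. A real implementation would also need to handle concurrent callers, which this sketch ignores.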
To maximize reuse of lambda forms and their compiled code, the type system for lambda form expressions is weakened to five so-called basic types: reference, int, long, float, and double. Because this weak typing does not by itself guarantee type safety, lambda forms can only be created by trusted Java code. Explicit casts and other checks preserve type safety at all entry points accessible to the user.
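To show why erasure to basic types increases sharing, the sketch below maps a `MethodType` to a short signature string over the five basic types. The class name, the letter codes, the `V` code for a void return, and the choice to fold `boolean`, `byte`, `short`, and `char` into int are assumptions made for this example; the text above only names the five basic types.

```java
import java.lang.invoke.MethodType;

public class BasicTypes {
    /** Erases a Java type to a one-letter basic type code (hypothetical encoding). */
    static char basicType(Class<?> type) {
        if (!type.isPrimitive()) return 'L'; // any reference type
        if (type == long.class) return 'J';
        if (type == float.class) return 'F';
        if (type == double.class) return 'D';
        if (type == void.class) return 'V';  // assumption: only meaningful as a return type
        return 'I';                          // int, and by assumption boolean, byte, short, char
    }

    /** Builds a key such as "LI_I" from the parameter and return types. */
    static String basicSignature(MethodType type) {
        StringBuilder sb = new StringBuilder();
        for (Class<?> parameter : type.parameterList()) {
            sb.append(basicType(parameter));
        }
        return sb.append('_').append(basicType(type.returnType())).toString();
    }
}
```

Under this encoding, `(String, int) -> boolean` and `(Object, char) -> int` both erase to `LI_I`, so a single lambda form and its compiled code could serve both, with casts applied at the user-visible entry points.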
As a related optimization made practical by the new framework, bound method handles will be "flattened" into small structs containing their bound values, with little or no boxing. These structs will be composed and loaded as needed.
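From the user's side, bound values arise whenever arguments are fixed in advance. The sketch below uses the public `MethodHandles.insertArguments` API to bind a receiver and an `int` argument; the class and helper names are hypothetical. How the JVM stores the two bound values internally is not visible here, and the flattened struct layout described above is an implementation detail.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BoundValues {
    /** Returns a handle of type ()String that computes text.substring(begin). */
    static MethodHandle substringFrom(String text, int begin) {
        try {
            MethodHandle substring = MethodHandles.lookup().findVirtual(
                    String.class, "substring",
                    MethodType.methodType(String.class, int.class));
            // Bind one reference (the receiver) and one int; nothing is left to pass.
            return MethodHandles.insertArguments(substring, 0, text, begin);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Invokes a fully bound ()String handle. */
    static String call(MethodHandle bound) {
        try {
            return (String) bound.invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Here the bound handle carries one reference and one `int`, which is the kind of value pair that a flattened carrier could hold without boxing the `int`.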
The bytecode generation framework for lambda forms and small data-carrying structs is based on the ASM library. Although designed to create method handles, it can be readily extended to create other types of objects. Therefore, this work will provide a likely basis for efficient representation of the functional "SAM" objects required by Project Lambda, and perhaps other future constructs, such as tuple objects or hybrid arrays.
Additional design notes are kept in the MLVM repository.
We could keep the assembly code and attempt to refine the existing pattern matcher and adapt it to compile stand-alone method handles (in the absence of call sites).
The disadvantage would be increased complexity in the JVM JITs, and (likely) more rapidly diminishing returns on optimization work. Maintenance of hand-written assembler for all target platforms would add larger per-platform costs, for both new and existing platforms. Special stack frame types (so-called ricochet frames) would perhaps increase the probability of bugs in modules which must walk the JVM stack.
Existing testing with unit tests (jtreg-based) and "big applications" will continue. The test coverage will be incrementally improved.
Customer-derived benchmarks will be used to detect performance improvements or regressions.
This will be a coordinated change in the JVM and JDK Java code bases. (The Java code changes are confined to the java.lang.invoke and sun.invoke packages.) The changes must be deployed together.
It appears impractical to build in cross-revision support (old JVM and new JDK and/or old JDK and new JVM). This means that platform-specific assembly code must be ported before a platform can run JDK 8.
We expect to back-port this code to JDK 7. This means that we need to complete the main body of the work (such as JVM refactorings) before any other pervasive changes, such as large-scale metadata changes (JEP 147) or metadata relocations (JEP 122), take place in the JDK 8 code base.