When linking a class, InstanceKlass::link_class_impl() first links all super classes and super interfaces of the current class. For the current class, it then verifies and rewrites the bytecode, links methods, initializes the itable and vtable, and sets the current class to 'linked' state.
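The ordering described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not HotSpot code: `ToyKlass`, `ToyState`, and `toy_link_class` are hypothetical names, and the per-class work (verification, rewriting, method linking, itable/vtable setup) is reduced to recording the order in which classes reach the 'linked' state.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a class's link state; not the HotSpot enum.
enum class ToyState { Loaded, Linked };

// Hypothetical, heavily simplified model of a loaded class.
struct ToyKlass {
    std::string name;
    ToyKlass* super;                    // direct super class, may be null
    std::vector<ToyKlass*> interfaces;  // direct super interfaces
    ToyState state;

    explicit ToyKlass(std::string n, ToyKlass* s = nullptr,
                      std::vector<ToyKlass*> i = {})
        : name(std::move(n)), super(s), interfaces(std::move(i)),
          state(ToyState::Loaded) {}
};

// Links the super types of `k` first, then `k` itself.
// Each newly linked class name is appended to `order`.
void toy_link_class(ToyKlass& k, std::vector<std::string>& order) {
    if (k.state == ToyState::Linked) return;  // already linked: nothing to do
    if (k.super != nullptr) toy_link_class(*k.super, order);
    for (ToyKlass* itf : k.interfaces) toy_link_class(*itf, order);
    // A real VM would verify and rewrite bytecode, link methods, and
    // initialize the itable and vtable here; the toy only records the order.
    k.state = ToyState::Linked;
    order.push_back(k.name);
}
```

Linking a class `Derived` with super class `Base` and super interface `Iface` records `Base`, `Iface`, `Derived`, in that order.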
When loading an archived class at runtime, SystemDictionary::load_shared_class makes sure the super types (all super classes and super interfaces) in the class hierarchy are loaded first. If they are not, the archived class is not used. The archived class is restored when 'loading' from the archive. At the end of the restoration, all methods are linked. Because bytecode verification and rewriting are done at CDS dump time, the runtime does not repeat those operations for an archived class.
If we make sure the itable and vtable are properly initialized (not needed for classes loaded by the NULL class loader) and SystemDictionaryShared::check_verification_constraints is performed for an archived class during restoration, then the archived class (from builtin loaders) is effectively in 'linked' state.
For all archived classes loaded by the builtin loaders, we can safely set the archived class to 'linked' state at the end of restoration. As a result, InstanceKlass::link_class_impl() no longer needs to iterate over the super types in those cases.
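The effect on the later link step can be sketched with a toy model. Everything here is an assumption made for illustration: `SketchKlass`, `sketch_restore`, and `sketch_link` are hypothetical names, and the counter only stands in for the super-type iteration that the real link step performs.

```cpp
#include <cassert>
#include <vector>

// Toy link state; not the HotSpot enum.
enum class LinkState { Loaded, Linked };

// Hypothetical, simplified model of a class restored from the archive.
struct SketchKlass {
    std::vector<SketchKlass*> super_types;  // super class and super interfaces
    LinkState state = LinkState::Loaded;
    bool from_builtin_loader = true;
};

// Models the end of restoration. With `mark_linked` set (the proposed
// behaviour), a class from a builtin loader leaves restoration 'linked'.
void sketch_restore(SketchKlass& k, bool mark_linked) {
    if (mark_linked && k.from_builtin_loader) {
        k.state = LinkState::Linked;
    }
}

// Models the later link request. Returns how many classes were visited,
// a stand-in for the super-type iteration cost.
int sketch_link(SketchKlass& k) {
    int visited = 1;
    if (k.state == LinkState::Linked) return visited;  // early exit
    for (SketchKlass* s : k.super_types) visited += sketch_link(*s);
    k.state = LinkState::Linked;
    return visited;
}
```

For a three-level hierarchy, the link request visits all three classes when restoration leaves them merely 'loaded', and only the requested class when restoration has already marked it 'linked'.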
Here is the 'before' and 'after' comparison when running HelloWorld (1000 runs) for the change on top of JDK 11:
before
---------
Performance counter stats for 'bin/java -cp hw.jar -Xshare:auto HelloWorld' (1000 runs):
69.99 msec task-clock:u # 1.125 CPUs utilized ( +- 0.33% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
4,488 page-faults:u # 64596.865 M/sec ( +- 0.03% )
90,755,863 cycles:u # 1306159.241 GHz ( +- 0.05% )
96,734,939 instructions:u # 1.07 insn per cycle ( +- 0.01% )
17,956,529 branches:u # 258430532.850 M/sec ( +- 0.01% )
573,094 branch-misses:u # 3.19% of all branches ( +- 0.09% )
0.062232 +- 0.000249 seconds time elapsed ( +- 0.40% )
after
------
Performance counter stats for 'bin/java -cp hw.jar -Xshare:auto HelloWorld' (1000 runs):
69.61 msec task-clock:u # 1.125 CPUs utilized ( +- 0.34% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
4,489 page-faults:u # 64941.193 M/sec ( +- 0.03% )
89,888,015 cycles:u # 1300369.120 GHz ( +- 0.03% )
95,082,578 instructions:u # 1.06 insn per cycle ( +- 0.01% )
17,580,311 branches:u # 254326380.673 M/sec ( +- 0.01% )
568,132 branch-misses:u # 3.23% of all branches ( +- 0.02% )
0.061886 +- 0.000251 seconds time elapsed ( +- 0.41% )
The change saves more than 1.5M executed instructions for HelloWorld. perf also shows a reduction in CPU cycles.
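The savings can be checked directly against the counters reported above. The helpers below are a minimal sketch; `counter_delta` and `percent_saved` are hypothetical names, and the inputs in the usage note are the `instructions:u` and `cycles:u` values from the two perf runs.

```cpp
#include <cassert>

// Absolute reduction between a 'before' and an 'after' counter value.
long long counter_delta(long long before, long long after) {
    return before - after;
}

// Reduction relative to the 'before' value, in percent.
double percent_saved(long long before, long long after) {
    return 100.0 * static_cast<double>(before - after)
                 / static_cast<double>(before);
}
```

With the numbers above, `counter_delta(96734939, 95082578)` gives 1,652,361 fewer instructions and `counter_delta(90755863, 89888015)` gives 867,848 fewer cycles.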
A more important motivation for this change is to lay a foundation for future optimizations that pre-resolve constant pool references (which in turn can help generate better-optimized AOT code), pre-initialize classes, and preserve those states at CDS dump time. The JVM specification requires the ordering of loading, verifying, linking/preparing, and initializing, and we seek a solution that is spec compliant. Being able to place an archived class in 'linked' state during restoration would, in the future, allow it to be placed in 'initialized' state at restore time in cases where that is suitable. That would satisfy some of the prerequisites for pre-resolving CP references to fields and methods.
- blocks
  - JDK-8245858 Enhance Java heap object (subgraph) archiving for more general support of selective class/static field pre-initialization (Closed)
- csr for
  - JDK-8246289 Set state to 'linked' when an archived boot class is restored at runtime (Closed)
- relates to
  - JDK-8178349 Cache builtin class loader constraints to avoid re-initializing itable/vtable for shared classes (Resolved)
  - JDK-8233887 Archived class pre-resolution and pre-initialization (Closed)
  - JDK-8246015 Method::link_method is called twice for CDS methods (Closed)