FileMapInfo::patch_archived_heap_embedded_pointers() may be called during VM bootstrap. This happens, for example, if the heap size has changed significantly between CDS dump time and run time.
http://hg.openjdk.java.net/jdk/jdk/file/5ac19bd3a1e2/src/hotspot/share/memory/filemap.cpp#l1885
The default CDS archive is created with -Xmx128m to optimize for apps with small heaps (e.g., those used in the cloud). However, if we run with a bigger heap (somewhat larger than 2GB), we're likely to use zero-based oop compression with a 3-bit shift, so the pointers embedded in the archived heap must be patched.
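To make the need for patching concrete, here is a simplified model of narrow-oop encoding. It is only a sketch, not the HotSpot code: the class name, the function name, the slot values, and the 4 GiB relocation delta are all hypothetical, and the dump-time encoding is assumed to be unscaled (base 0, shift 0). A narrow oop decodes to base + (narrow << shift), so a 32-bit value written under one encoding points somewhere else under another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NarrowOopEncoding:
    """Illustrative model: address = base + (narrow << shift)."""
    base: int   # 0 for the zero-based and unscaled modes
    shift: int  # 0 = unscaled, 3 = 8-byte object alignment

    def encode(self, address: int) -> int:
        offset = address - self.base
        # The address must be representable under this encoding.
        assert offset >= 0 and offset % (1 << self.shift) == 0
        narrow = offset >> self.shift
        assert narrow < (1 << 32)
        return narrow

    def decode(self, narrow: int) -> int:
        return self.base + (narrow << self.shift)


def patch_embedded_pointers(slots, dump_enc, run_enc, relocation_delta=0):
    """Re-encode every non-null slot from the dump-time to the run-time encoding."""
    patched = []
    for narrow in slots:
        if narrow == 0:          # a narrow value of 0 stands for null
            patched.append(0)
            continue
        address = dump_enc.decode(narrow) + relocation_delta
        patched.append(run_enc.encode(address))
    return patched


# Assumed encodings: unscaled at dump time (-Xmx128m), zero-based 3-bit shift at run time.
dump_enc = NarrowOopEncoding(base=0, shift=0)
run_enc = NarrowOopEncoding(base=0, shift=3)

archived_slots = [0x07F0_0000, 0, 0x07F0_0018]   # hypothetical archived pointer fields
delta = 0x1_0000_0000                            # hypothetical: region mapped 4 GiB higher
patched_slots = patch_embedded_pointers(archived_slots, dump_enc, run_enc, delta)
```

In this model every non-null slot changes value, even though the object graph itself is unchanged. That is why the whole archived region has to be walked once at startup.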
Because the patching is fairly independent, and the patched contents are not needed until the VM loads the first class, it should be safe to do the patching in a GC worker thread while the main VM thread continues with other initialization (such as interpreter generation).
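The proposed overlap can be sketched as follows. This is a toy model in Python threads, not VM code: the function names, the thread name, and the log strings are hypothetical, and the patching is reduced to a change of shift with no relocation. The point it illustrates is the ordering constraint: independent initialization may run alongside the patching, but the first class load must wait for the patching to finish.

```python
import threading


def patch_slots(slots, old_shift, new_shift):
    """Re-encode each non-null slot in place (no relocation in this toy model)."""
    for i, narrow in enumerate(slots):
        if narrow != 0:
            slots[i] = (narrow << old_shift) >> new_shift


def bootstrap(slots):
    """Run patching on a worker thread while the main thread keeps initializing."""
    log = []
    patch_done = threading.Event()

    def worker():
        patch_slots(slots, old_shift=0, new_shift=3)
        patch_done.set()

    t = threading.Thread(target=worker, name="hypothetical-gc-worker")
    t.start()

    # Work that does not read the archived heap can proceed immediately.
    log.append("interpreter generated")

    # Barrier: the patched contents are first needed when a class is loaded.
    patch_done.wait()
    log.append("first class loaded")

    t.join()
    return log


archived = [0x1000, 0, 0x1018]   # hypothetical unscaled narrow oops
events = bootstrap(archived)
```

The event wait is the only synchronization point in the sketch. A real change would also have to make sure that nothing on the main thread touches the archived heap before that point.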
============== about 2.64% degradation (1.09 ms) in elapsed time for java -version
# No patching (dumptime heap = runtime heap)
$ java -Xshare:dump -Xmx4g
$ perf stat -r 100 java -Xmx4g -version
Performance counter stats for '/jdk/bld/bench/eva/loader_con0/bin/java -Xmx4g -version' (100 runs):
56.11 msec task-clock # 1.360 CPUs utilized ( +- 0.60% )
199 context-switches # 0.004 M/sec ( +- 1.04% )
12 cpu-migrations # 0.206 K/sec ( +- 2.08% )
2,937 page-faults # 0.052 M/sec ( +- 0.04% )
137,492,599 cycles # 2.451 GHz ( +- 0.15% )
85,290,084 stalled-cycles-frontend # 62.03% frontend cycles idle ( +- 0.18% )
66,524,836 stalled-cycles-backend # 48.38% backend cycles idle ( +- 0.21% )
105,195,769 instructions # 0.77 insn per cycle
# 0.81 stalled cycles per insn ( +- 0.14% )
20,009,439 branches # 356.625 M/sec ( +- 0.15% )
1,005,784 branch-misses # 5.03% of all branches ( +- 0.19% )
0.041249 +- 0.000282 seconds time elapsed ( +- 0.68% )
# With patching (dumptime heap != runtime heap)
$ java -Xshare:dump -Xmx128m
$ perf stat -r 100 java -Xmx4g -version
Performance counter stats for '/jdk/bld/bench/eva/loader_con0/bin/java -Xmx4g -version' (100 runs):
56.96 msec task-clock # 1.345 CPUs utilized ( +- 0.55% )
202 context-switches # 0.004 M/sec ( +- 0.97% )
12 cpu-migrations # 0.212 K/sec ( +- 1.90% )
3,118 page-faults # 0.055 M/sec ( +- 0.04% )
139,782,303 cycles # 2.454 GHz ( +- 0.14% )
86,193,560 stalled-cycles-frontend # 61.66% frontend cycles idle ( +- 0.15% )
67,261,311 stalled-cycles-backend # 48.12% backend cycles idle ( +- 0.17% )
107,975,073 instructions # 0.77 insn per cycle
# 0.80 stalled cycles per insn ( +- 0.14% )
20,706,343 branches # 363.509 M/sec ( +- 0.15% )
1,023,199 branch-misses # 4.94% of all branches ( +- 0.19% )
0.042338 +- 0.000274 seconds time elapsed ( +- 0.65% )
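The headline figure can be re-derived from the two "seconds time elapsed" lines above. The short calculation below does that. The combined-error line is an extra check that is not in the original report; it assumes the two +- values are independent standard errors of the mean.

```python
import math

# "seconds time elapsed" from the two perf stat runs above
baseline_s, baseline_err_s = 0.041249, 0.000282   # no patching
patched_s, patched_err_s = 0.042338, 0.000274     # with patching

delta_ms = (patched_s - baseline_s) * 1000.0
slowdown_pct = (patched_s - baseline_s) / baseline_s * 100.0

# Assumption: the two errors are independent, so they add in quadrature.
combined_err_ms = math.hypot(baseline_err_s, patched_err_s) * 1000.0

print(f"delta = {delta_ms:.2f} ms, slowdown = {slowdown_pct:.2f}%, "
      f"combined error = {combined_err_ms:.2f} ms")
```

Under that assumption the measured difference is a bit under three times the combined error, so the slowdown looks real but small.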
- duplicates JDK-8326035 AOT Class Loading & Linking with Multiple GCs
- Submitted