I have been doing some startup time research for Leyden and different GCs, and noticed that ZGC is more than an order of magnitude slower than G1 on startup tests. Look:
$ cat Hello.java
public class Hello {
  public static void main(String... args) throws Throwable {
    System.out.println("Hello world!");
  }
}
$ javac Hello.java
$ hyperfine -w 10 -r 10 "build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UseG1GC -Xmx1g Hello"
Benchmark 1: build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UseG1GC -Xmx1g Hello
Time (mean ± σ): 32.5 ms ± 0.6 ms [User: 15.8 ms, System: 24.5 ms]
Range (min ... max): 31.7 ms ... 33.6 ms 10 runs
$ hyperfine -w 10 -r 10 "build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UseZGC -Xmx1g Hello"
Benchmark 1: build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UseZGC -Xmx1g Hello
Time (mean ± σ): 604.2 ms ± 5.6 ms [User: 32.5 ms, System: 590.7 ms]
Range (min ... max): 596.9 ms ... 614.4 ms 10 runs
$ hyperfine -w 10 -r 10 "build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UseZGC -Xmx10g Hello"
Benchmark 1: build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UseZGC -Xmx10g Hello
Time (mean ± σ): 1.151 s ± 0.005 s [User: 0.033 s, System: 1.137 s]
Range (min ... max): 1.144 s ... 1.158 s 10 runs
Quick profiling shows ZGC spends time wiring up memory:
- 68.24% 0.00% java libc.so.6 [.] start_thread
- 68.24% start_thread
- 68.24% ThreadJavaMain
- JavaMain
- 68.22% JNI_CreateJavaVM
- 68.21% Threads::create_vm(JavaVMInitArgs*, bool*)
- 67.98% init_globals()
- 67.97% universe_init()
- 67.93% ZArguments::create_heap()
- 67.93% ZCollectedHeap::ZCollectedHeap()
- 67.93% ZHeap::ZHeap()
- 67.92% ZPageAllocator::prime_cache(ZWorkers*, unsigned long)
- ZPageAllocator::alloc_page(ZPageType, unsigned long, ZAllocationFlags, ZP
- 67.92% ZPageAllocator::alloc_page_finalize(ZPageAllocation*)
- 67.92% ZPhysicalMemoryManager::commit(ZPhysicalMemory&)
ZPhysicalMemoryBacking::commit(zoffset, unsigned long)
ZPhysicalMemoryBacking::commit_inner(zoffset, unsigned long)
ZPhysicalMemoryBacking::fallocate_fill_hole(zoffset, unsigned
- syscall
- 67.91% entry_SYSCALL_64_after_hwframe
Of course lower -Xms helps:
$ hyperfine -w 10 -r 10 "build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UseZGC -Xms128m -Xmx10g Hello"
Benchmark 1: build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UseZGC -Xms128m -Xmx10g Hello
Time (mean ± σ): 108.0 ms ± 1.5 ms [User: 32.6 ms, System: 95.6 ms]
Range (min … max): 105.2 ms … 110.4 ms 10 runs
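To confirm which initial heap size a given command line actually resolves to, the standard MemoryMXBean can report it. This is only a minimal sketch: the class name HeapSizes and the helper mib are made up for illustration, and loading the management classes adds its own startup cost, so it should not be used inside the timed runs above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSizes {
    // Snapshot of heap sizing as seen by the running JVM.
    static MemoryUsage heapUsage() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    // MemoryUsage reports -1 when a value is undefined.
    static String mib(long bytes) {
        return bytes < 0 ? "undefined" : (bytes >> 20) + " MiB";
    }

    public static void main(String... args) {
        MemoryUsage heap = heapUsage();
        System.out.println("init:      " + mib(heap.getInit()));      // initial heap size requested at startup
        System.out.println("committed: " + mib(heap.getCommitted())); // memory currently committed for the heap
        System.out.println("max:       " + mib(heap.getMax()));       // upper bound, e.g. from -Xmx
    }
}
```

Running it once with and once without -Xms128m should show whether the initial size follows the flag or a larger default.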
While we should really consider trimming the default initial heap size (JDK-8348278), that would not help if the user specifies -Xms explicitly. So it would be great to see if we can improve this code path for large initial heap sizes. Maybe by doing these commits in parallel or asynchronously?
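The shape of the "in parallel" idea can be sketched in plain Java. This is an illustration only, not HotSpot code: the Committer interface, the chunk size, and the thread count are hypothetical, and the commit callback is a no-op stand-in for the fallocate call seen in the profile. Whether concurrent commits actually scale depends on how the kernel handles them, which is not measured here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCommitSketch {
    /** Hypothetical stand-in for the real commit operation. */
    interface Committer {
        void commit(long offset, long length);
    }

    /**
     * Splits [0, size) into chunks of at most `granule` bytes, commits each
     * chunk on a worker thread, and returns the total number of bytes committed.
     */
    static long commitInParallel(long size, long granule, int nThreads, Committer c)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (long off = 0; off < size; off += granule) {
                final long offset = off;
                final long length = Math.min(granule, size - off); // last chunk may be short
                pending.add(pool.submit(() -> {
                    c.commit(offset, length);
                    return length;
                }));
            }
            long committed = 0;
            for (Future<Long> f : pending) {
                committed += f.get(); // waits for the chunk; rethrows any failure
            }
            return committed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String... args) throws Exception {
        long size = 1L << 30;    // 1 GiB, matching the -Xmx1g run
        long granule = 2L << 20; // 2 MiB chunks, an assumed size
        long done = commitInParallel(size, granule, 4, (off, len) -> { /* no-op */ });
        System.out.println("Committed bytes: " + done);
    }
}
```

An asynchronous variant would return before the futures complete and let the heap grow into the committed range as chunks finish; that needs coordination with the allocator and is outside this sketch.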
Relates to: JDK-8348278 Trim InitialRAMPercentage to improve startup in default modes (Open)