Type: Bug
Resolution: Unresolved
Priority: P3
Fix Version/s: 26
Affects Version/s: 16
Component/s: hotspot
Labels:

Subcomponent: gc

The JRuby bug tracker has a report that later JDKs (e.g. JDK 14) are about 20% slower than older releases (https://github.com/jruby/jruby/issues/5789 via https://twitter.com/headius/status/1297992914832769024).

The main reason is the change of the default GC in JDK 9 (from Parallel to G1); however, the difference is abnormally high, so it is reported here. The typical observed difference for known outliers is around 10%.

After some tuning, i.e. setting -Xms equal to -Xmx and using 32M regions, the difference can be reduced a bit, to ~13-15%.

One suspicion is the barriers, as reported by [~shade] (in that bug report):

"Tested with recent JDK 13 EA and multiple collectors. Judging from GC logs, it is heavily-allocating, but fairly young-gc workload. Both Parallel and G1 run very short Young GCs during the run, taking about 1% of total time, which means allocation pressure itself is not the issue here."

Local results:
#  GC        score  [% of baseline]  options
1  parallel  17.26  100.0%           -Xmx1500m (oob)
2  g1        13.64   79.0%           -Xmx1500m (oob)

3  parallel  17.16  100.0%           -Xmx1500m -Xms1500m -Xmn1000m
4  g1        13.99   81.5%           -Xmx1500m -Xms1500m -Xmn1000m
5  g1        14.36   83.7%           -Xmx1500m -Xms1500m -Xmn1000m (rerun)
6  g1        15.13   88.2%           -Xmx1500m -Xms1500m -Xmn1000m -XX:G1HeapRegionSize=32m
7  g1        14.90   86.8%           -Xmx1500m -Xms1500m -Xmn1000m -XX:G1HeapRegionSize=32m (rerun)

8  parallel  13.81  100.0%           graal -Xmx1500m -Xms1500m -Xmn1000m -XX:G1HeapRegionSize=32m
9  g1        13.11   94.9%           graal -Xmx1500m -Xms1500m -Xmn1000m -XX:G1HeapRegionSize=32m

The interesting runs are 8 and 9, with Graal. Graal seems slower overall, but it also shows only a small difference (about 5%) between Parallel and G1. So potentially the issue lies with C2 optimizations that only kick in with Parallel GC's (small) barriers.

Some initial experiments with -XX:MaxInlineSize and -XX:FreqInlineSize did not yield interesting results.
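
For reference, the "% of baseline" column in the local results can be rechecked with a few lines of Python. This is only a sketch: the scores are copied from the table, while the mapping of each G1 run to its Parallel baseline (run 1, 3 or 8) is an assumption inferred from the grouping of the rows, and the names scores, baseline_of and percent_of_baseline are made up for illustration.

```python
# Scores copied from the "Local results" table, keyed by run number.
scores = {
    1: 17.26, 2: 13.64,
    3: 17.16, 4: 13.99, 5: 14.36, 6: 15.13, 7: 14.90,
    8: 13.81, 9: 13.11,
}

# Assumption: each G1 run is compared against the Parallel run
# that heads its group of rows in the table.
baseline_of = {2: 1, 4: 3, 5: 3, 6: 3, 7: 3, 9: 8}


def percent_of_baseline(run):
    """Return the score of `run` as a percentage of its Parallel baseline."""
    return 100.0 * scores[run] / scores[baseline_of[run]]


for run in sorted(baseline_of):
    pct = percent_of_baseline(run)
    print(f"run {run}: {pct:.1f}% of baseline ({100.0 - pct:.1f}% slower)")
```

With this grouping the recomputed percentages should match the table to one decimal place, and the Graal pair (runs 8 and 9) shows the smallest gap.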

Reproduction:
* Download JRuby from https://www.jruby.org/download
* Clone https://github.com/PragTob/rubykon
* Run jruby -Xcompile.invokedynamic=true -J-Xmx1500m benchmark/mcts_avg.rb

JRuby will pick up the VM pointed to by JAVA_HOME; you can check which with "jruby -v".
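
To compare collectors explicitly rather than relying on the JDK default, the command line above can be generated per collector. The following is a minimal sketch, not part of the original report: it assumes the standard HotSpot flags -XX:+UseParallelGC and -XX:+UseG1GC, passed through JRuby's -J prefix in the same way as -J-Xmx1500m, and the helper name jruby_command is hypothetical.

```python
import shlex

# Assumption: standard HotSpot collector-selection flags; the report
# itself does not spell out how the collector was chosen.
GC_FLAGS = {
    "parallel": "-XX:+UseParallelGC",
    "g1": "-XX:+UseG1GC",
}


def jruby_command(gc, extra_jvm_flags=("-Xmx1500m",)):
    """Build the benchmark command line for one collector (hypothetical helper)."""
    jvm_flags = [GC_FLAGS[gc], *extra_jvm_flags]
    return [
        "jruby",
        "-Xcompile.invokedynamic=true",
        *[f"-J{flag}" for flag in jvm_flags],  # -J forwards the flag to the JVM
        "benchmark/mcts_avg.rb",
    ]


# Print the commands instead of running them, so they can be inspected first.
for gc in GC_FLAGS:
    print(shlex.join(jruby_command(gc)))
```

The printed commands are meant to be run from the rubykon checkout, with JAVA_HOME pointing at the JDK under test.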

relates to

JDK-8132937 G1 compares badly to Parallel GC on throughput on javac benchmark (Open)
JDK-8133055 Investigate G1 performance on SPL4 (Closed)
JDK-8226197 Reduce G1’s CPU cost with simplified write post-barrier and disabling concurrent refinement (Closed)
JDK-8226731 Remove StoreLoad in G1 post barrier (Closed)
JDK-8340827 G1: Improve Application Throughput with a More Efficient Write-Barrier (Submitted)

Assignee: Unassigned
Reporter: Thomas Schatzl
Votes: 0
Watchers: 5
Created: 2020-09-16 03:26
Updated: 2025-05-20 07:39
