JDK-7164100

Throughput collector shows performance regression vs jrockit


Details

    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P3
    • Fix Version/s: 9
    • Affects Version/s: 7u4
    • Component/s: hotspot
    • Labels: None
    • Subcomponent: gc
    • CPU: generic
    • OS: linux

    Description

      With equal-sized heaps, most fusion apps are showing a significant performance regression using JDK 7u4 vs jrockit and the default collectors in each VM. While using CMS and G1 can improve the situation for JDK 7u4, the performance requirement is that the (default) throughput collector perform as well as jrockit's default collector. Hence, we are focused here only on the performance of the parallel collectors.

      Here is a sampling of data points:
      For ADFCRMdemo, jrockit gives an average response time of 0.042 seconds; hotspot gives an average response time of 0.064 seconds. In the steady-state GC logs, we see this for hotspot:

      Minor GC: 26776.82899999998 in 383 collections; avg 69.91339164490856
      Full GC: 28875.450000000004 in 8 collections; avg 3609.4312500000005
      Average heap after GC: 1539364.75

      For jrockit:
      Minor GC: 14674.098000000004 in 167 collections; avg 87.86885029940122
      Full GC: 8655.18 in 19 collections; avg 455.53578947368425
      Average heap after GC: 1811460.5263157894
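
      (A minimal sketch checking the per-collection averages above -- each is just the total pause time divided by the collection count, using the figures quoted from the logs:)

```python
# Sanity-check the quoted per-collection averages: avg = total / count.
# Totals and counts are the figures quoted above; units as printed in the logs.
gc_summaries = {
    "hotspot minor": (26776.829, 383),
    "hotspot full":  (28875.450, 8),
    "jrockit minor": (14674.098, 167),
    "jrockit full":  (8655.180, 19),
}

for name, (total, count) in gc_summaries.items():
    print(f"{name}: avg {total / count:.2f} over {count} collections")
```

      Note that hotspot's full GCs, though far fewer, average roughly eight times longer than jrockit's (about 3609 vs. 456).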

      One difference here is the size of the Eden/young generations. In hotspot, we are explicitly setting that with -Xmn512m (more on that in a bit); in jrockit we let the system determine the size. It would seem that jrockit has selected a larger eden in order to do fewer GCs.

      Similarly, for ATG CRMdemo, hotspot gives us a 0.251-second response time compared to jrockit's 0.228 seconds. In this case, the total time in GC for hotspot is actually less than jrockit's:
      HS Minor GC: 77.501 sec in 987 collections
      HS Full GC: 80.341 sec in 30 collections
      JR Minor GC: 66.264 sec in 975 collections
      JR Full GC: 136.095 sec in 224 collections

      The problem here is the really long time the individual HS full GCs take -- they completely throw the average response times out of whack (in fact, the 90th-percentile response times are less than the average response times, and less than the jrockit 90th-percentile response times). That is also why we see much better results with CMS and G1GC.
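
      (A toy illustration -- made-up numbers, not from these runs -- of how a handful of multi-second full-GC pauses can drag the mean above the 90th percentile:)

```python
# Hypothetical response times: 95 fast requests plus 5 that landed behind
# a long full-GC pause. The heavy tail inflates the mean past the 90th
# percentile, which never sees the slow requests at all.
times = [0.04] * 95 + [1.0] * 5

mean = sum(times) / len(times)
p90 = sorted(times)[int(0.9 * len(times)) - 1]  # simple nearest-rank estimate

print(f"mean={mean:.3f}s  p90={p90:.3f}s")  # mean (0.088) exceeds p90 (0.040)
```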

      Although these data points indicate that we would probably be better off with a larger eden and smaller old gen (because, as in the case of jrockit, we can tolerate more old GCs if each one is short enough not to throw off the average), testing has not borne that out, and hence the best results we get are with a 512m new size in hotspot (unlike jrockit). When we remove the -Xmn argument, we see (for example) a 37% increase in the number of old GCs; although that comes with only a 22% increase in the total time in old GC, the collections end up being too frequent. We are continuing to test along those lines, though; perhaps survivor-ratio or other tuning can get us to a point where the old GC times are not such an issue.
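
      (Taking the 37% and 22% figures above at face value, the arithmetic says each old GC actually gets about 11% shorter without -Xmn -- they are just more frequent. A quick check, using the ATG CRMdemo HS full-GC baseline quoted above:)

```python
# Without -Xmn512m: 37% more old GCs, but only 22% more total old-GC time.
# Baseline figures are the ATG CRMdemo HS full-GC numbers quoted earlier;
# only the ratios matter for the per-collection change.
baseline_count, baseline_total = 30, 80.341  # collections, seconds

new_count = baseline_count * 1.37
new_total = baseline_total * 1.22

change_in_avg = (new_total / new_count) / (baseline_total / baseline_count) - 1
print(f"average old-GC pause changes by {change_in_avg:+.1%}")
```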

      I wonder if the fully-compacting Hotspot vs. not-fully-compacting jrockit full GC issue is the key here? Or are there other possible explanations?


            People

              Assignee: jcoomes John Coomes (Inactive)
              Reporter: soaks Scott Oaks
              Votes: 0
              Watchers: 5
