JDK-4803222

1.4.1_01 JVM becomes VERY slow with the -XX:+UseConcMarkSweepGC

    • Type: Bug
    • Resolution: Duplicate
    • Priority: P4
    • Fix Version: None
    • Affects Version: 1.4.1
    • Component: hotspot
    • Subcomponent: gc
    • CPU: x86
    • OS: linux



      Name: jl125535 Date: 01/14/2003


      FULL PRODUCT VERSION :
      java version "1.4.1_01"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-b01)
      Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode)


      FULL OPERATING SYSTEM VERSION :
      Redhat 7.3, glibc 2.2.5, kernel 2.4.18
       
      ADDITIONAL OPERATING SYSTEMS :


      EXTRA RELEVANT SYSTEM CONFIGURATION :
      This happens on a dual-CPU Intel Xeon 2.4 GHz machine with
      Hyperthreading turned on. We have seen the problem on
      other machines as well, but it is easiest to reproduce on a
      machine with Hyperthreading.

      A DESCRIPTION OF THE PROBLEM :
      When running multithreaded programs that create a lot of
      garbage with the -XX:+UseConcMarkSweepGC flag, the JVM
      effectively locks up: it becomes extremely slow, although
      the program still seems to make progress. This happens most
      frequently on a Hyperthreaded dual-processor machine, but we
      have seen it on other machines as well.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      1. Run the program with the following options:
      java -server -Xmn16M -Xms80M -Xmx80M
      -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails
      -XX:+PrintTenuringDistribution -XX:+PrintGCTimeStamps
      ConcurrentGCProblem
        
      2. Monitor the CPU usage using "vmstat 1".
      After one or two CMS garbage collections, the CPU usage
      suddenly drops very low (less than 5%).
         


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      Expected: steady CPU usage as the workers keep allocating.
      Actual: after one or two CMS garbage collections, the CPU
      usage suddenly drops very low (less than 5%). This happens
      only with the concurrent GC turned on.

      ERROR MESSAGES/STACK TRACES THAT OCCUR :
      Here is the output I get. After the last printout (which looks a little
      abnormal because there is no CMS-start line, etc.), the CPU usage drops to
      less than 5%.
      ===========================================================
      2e-06: [GC 0.000223: [DefNew
      Desired survivor size 819200 bytes, new threshold 1 (max 31)
      - age 1: 1638392 bytes, 1638392 total
      : 13183K->1599K(14784K), 0.2473340 secs] 13183K->12065K(80320K), 0.2475070
      secs]
      0.605896: [GC 0.606014: [DefNew
      Desired survivor size 819200 bytes, new threshold 1 (max 31)
      - age 1: 1638400 bytes, 1638400 total
      : 14783K->1600K(14784K), 0.2707170 secs] 25249K->24091K(80320K), 0.2708460
      secs]
      1.28198: [GC 1.28209: [DefNew
      Desired survivor size 819200 bytes, new threshold 1 (max 31)
      - age 1: 1638400 bytes, 1638400 total
      : 14783K->1600K(14784K), 0.2862180 secs] 37275K->36563K(80320K), 0.2863430
      secs]
      1.89358: [GC 1.89366: [DefNew
      Desired survivor size 819200 bytes, new threshold 1 (max 31)
      - age 1: 1638400 bytes, 1638400 total
      : 14783K->1600K(14784K), 0.2903200 secs] 49747K->49136K(80320K), 0.2904670
      secs]
      2.18609: [GC [1 CMS-initial-mark: 47536K(65536K)] 49164K(80320K), 0.0095590
      secs]
      2.19582: [CMS-concurrent-mark-start]
      ======++++ size = 432343
      2.69469: [CMS-concurrent-mark: 0.498/0.499 secs]
      2.69478: [CMS-concurrent-preclean-start]
      2.69481: [CMS-concurrent-preclean: 0.000/0.000 secs]
      2.71154: [GC2.71158: [dirty card accumulation, 0.0003120 secs]2.71193: [dirty
      card rescan, 0.0103110 secs]2.72229: [remark from roots, 0.0865510
      secs]2.80891: [weak refs processing, 0.0000310 secs] [1 CMS-remark:
      47536K(65536K)] 59224K(80320K), 0.0974280 secs]
      2.80918: [CMS-concurrent-sweep-start]
      2.97609: [CMS-concurrent-sweep: 0.164/0.167 secs]
      2.97647: [CMS-concurrent-reset-start]
      2.99376: [CMS-concurrent-reset: 0.017/0.017 secs]
      3.39866: [GC 3.39881: [DefNew
      Desired survivor size 819200 bytes, new threshold 1 (max 31)
      - age 1: 1638400 bytes, 1638400 total
      : 14784K->1600K(14784K), 0.1437860 secs] 59597K->51252K(80320K), 0.1439460
      secs]
      3.54648: [GC [1 CMS-initial-mark: 49652K(65536K)] 51266K(80320K), 0.0063560
      secs]
      3.55297: [CMS-concurrent-mark-start]
      4.06014: [CMS-concurrent-mark: 0.505/0.507 secs]
      4.06023: [CMS-concurrent-preclean-start]
      4.06025: [CMS-concurrent-preclean: 0.000/0.000 secs]
      4.08755: [GC4.0876: [dirty card accumulation, 0.0002860 secs]4.08791: [dirty
      card rescan, 0.0113610 secs]4.09933: [remark from roots, 0.0358710
      secs]4.13527: [weak refs processing, 0.0000400 secs] [1 CMS-remark:
      49652K(65536K)] 53036K(80320K), 0.0477920 secs]
      4.13552: [CMS-concurrent-sweep-start]
      4.54434: [CMS-concurrent-sweep: 0.405/0.409 secs]
      4.54446: [CMS-concurrent-reset-start]
      4.55817: [CMS-concurrent-reset: 0.014/0.014 secs]
      ======++++ size = 106
      7.48048: [GC 7.48057: [DefNew: 14784K->14784K(14784K), 0.0000400 secs]7.48064:
      [CMS7.77193: [dirty card accumulation, 0.0003070 secs]7.77231: [dirty card
      rescan, 0.0028040 secs]7.77517: [remark from roots, 0.0729060 secs]7.84815:
      [weak refs processing, 0.0000400 secs]: 33821K->33821K(65536K), 0.4708740
      secs] 48605K->48605K(80320K), 0.4711200 secs]
       
      ==========================================================
      Here is the output I obtain when I send a SIGQUIT to the process. Almost all
      of the worker threads are waiting for monitor entry!
      ==========================================================
       
      Full thread dump Java HotSpot(TM) Server VM (1.4.1_01-b01 mixed mode):
       
      "DestroyJavaVM" prio=1 tid=0x0x39407940 nid=0x4d05 waiting on condition
      [0..7fffd2c0]
       
      "Thread-20" prio=1 tid=0x0x39406e48 nid=0x4d24 waiting for monitor entry
      [39b0b000..39b0b840]
              at java.util.HashMap.addEntry(HashMap.java:719)
              at java.util.HashMap.put(HashMap.java:388)
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - locked <0x3078a110> (a java.util.HashMap)
       
      "Thread-19" prio=1 tid=0x0x39406350 nid=0x4d23 waiting for monitor entry
      [39a8a000..39a8a840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30796e08> (a java.util.HashMap)
       
      "Thread-18" prio=1 tid=0x0x39405858 nid=0x4d22 waiting for monitor entry
      [39a09000..39a09840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30797f60> (a java.util.HashMap)
       
      "Thread-17" prio=1 tid=0x0x39404d60 nid=0x4d21 waiting for monitor entry
      [39988000..39988840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30797240> (a java.util.HashMap)
       
      "Thread-16" prio=1 tid=0x0x39404268 nid=0x4d20 waiting for monitor entry
      [39907000..39907840]
              at java.util.HashMap.addEntry(HashMap.java:719)
              at java.util.HashMap.put(HashMap.java:388)
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - locked <0x30796ef8> (a java.util.HashMap)
       
      "Thread-15" prio=1 tid=0x0x39403770 nid=0x4d1f waiting for monitor entry
      [39886000..39886840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x3078a160> (a java.util.HashMap)
       
      "Thread-14" prio=1 tid=0x0x39402c78 nid=0x4d1e waiting for monitor entry
      [39805000..39805840]
              at java.util.HashMap.addEntry(HashMap.java:719)
              at java.util.HashMap.put(HashMap.java:388)
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - locked <0x30796ca0> (a java.util.HashMap)
       
      "Thread-13" prio=1 tid=0x0x39402180 nid=0x4d1d waiting for monitor entry
      [39784000..39784840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:25)
              - waiting to lock <0x3078a0e8> (a java.util.HashMap)
       
      "Thread-12" prio=1 tid=0x0x39401688 nid=0x4d1c runnable [39703000..39703840]
              at java.util.HashMap.addEntry(HashMap.java:719)
              at java.util.HashMap.put(HashMap.java:388)
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - locked <0x3078a0e8> (a java.util.HashMap)
       
      "Thread-11" prio=1 tid=0x0x39400b90 nid=0x4d1b waiting for monitor entry
      [39682000..39682840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x307973d0> (a java.util.HashMap)
       
      "Thread-10" prio=1 tid=0x0x80f3c98 nid=0x4d1a waiting for monitor entry
      [39601000..39601840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30797e48> (a java.util.HashMap)
       
      "Thread-9" prio=1 tid=0x0x80f2fd8 nid=0x4d19 waiting for monitor entry
      [39580000..39580840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:25)
              - waiting to lock <0x30796ef8> (a java.util.HashMap)
       
      "Thread-8" prio=1 tid=0x0x80f2318 nid=0x4d18 waiting for monitor entry
      [393b8000..393b8840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30796c00> (a java.util.HashMap)
       
      "Thread-7" prio=1 tid=0x0x80f1418 nid=0x4d17 waiting for monitor entry
      [39337000..39337840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30796e58> (a java.util.HashMap)
       
      "Thread-6" prio=1 tid=0x0x80f0798 nid=0x4d16 waiting for monitor entry
      [392b6000..392b6840]
              at java.util.HashMap.addEntry(HashMap.java:719)
              at java.util.HashMap.put(HashMap.java:388)
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - locked <0x307971f0> (a java.util.HashMap)
       
      "Thread-5" prio=1 tid=0x0x80efc18 nid=0x4d15 waiting for monitor entry
      [39235000..39235840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30797218> (a java.util.HashMap)
       
      "Thread-4" prio=1 tid=0x0x80ef098 nid=0x4d14 waiting for monitor entry
      [391b4000..391b4840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30796db8> (a java.util.HashMap)
       
      "Thread-3" prio=1 tid=0x0x80ee500 nid=0x4d13 waiting for monitor entry
      [39133000..39133840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30796fe8> (a java.util.HashMap)
       
      "Thread-2" prio=1 tid=0x0x80ed998 nid=0x4d12 waiting for monitor entry
      [390b2000..390b2840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30797d80> (a java.util.HashMap)
       
      "Thread-1" prio=1 tid=0x0x80ecde8 nid=0x4d11 waiting for monitor entry
      [39031000..39031840]
              at ConcurrentGCProblem$Worker.run(ConcurrentGCProblem.java:27)
              - waiting to lock <0x30796d90> (a java.util.HashMap)
       
      "Signal Dispatcher" daemon prio=1 tid=0x0x80bb468 nid=0x4d0e waiting on
      condition [0..0]
       
      "Surrogate Locker Thread (CMS)" daemon prio=1 tid=0x0x80ba7b0 nid=0x4d0b
      runnable [0..2b1939f8]
       
      "Finalizer" daemon prio=1 tid=0x0x80b56b8 nid=0x4d0a in Object.wait()
      [388c1000..388c1840]
              at java.lang.Object.wait(Native Method)
              - waiting on <0x3078c0b0> (a java.lang.ref.ReferenceQueue$Lock)
              at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:111)
              - locked <0x3078c0b0> (a java.lang.ref.ReferenceQueue$Lock)
              at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:127)
              at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
       
      "Reference Handler" daemon prio=1 tid=0x0x80b43a0 nid=0x4d09 in Object.wait()
      [38840000..38840840]
              at java.lang.Object.wait(Native Method)
              - waiting on <0x3078c050> (a java.lang.ref.Reference$Lock)
              at java.lang.Object.wait(Object.java:426)
              at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:113)
              - locked <0x3078c050> (a java.lang.ref.Reference$Lock)
       
      "VM Thread" prio=1 tid=0x0x80b3948 nid=0x4d08 runnable
       
      "VM Periodic Task Thread" prio=1 tid=0x0x80bb2c8 nid=0x4d0c waiting on
      condition
      "Suspend Checker Thread" prio=1 tid=0x0x80bb360 nid=0x4d0d runnable


      REPRODUCIBILITY :
      This bug can be reproduced often.

      ---------- BEGIN SOURCE ----------

       
      import java.util.*;

      public class ConcurrentGCProblem
      {
          public static final int TARGET_SIZE = 100000;

          static final class Worker extends Thread {
              Worker(boolean print) { print_ = print; }
              private final boolean print_;

              public void run() {
                  int dataSize = 0;
                  int count = 0;
                  while (true) {
                      Map m = maps_[(int)(Math.random() * maps_.length)];
                      boolean add;
                      double diff = (double)(dataSize - TARGET_SIZE);
                      if (diff < 0) {
                          diff = -diff;
                          add = (2 * Math.random() + diff/TARGET_SIZE > 1);
                      }
                      else {
                          add = (2 * Math.random() + diff/TARGET_SIZE < 1);
                      }
                      synchronized (m) {
                          if (add) {
                              m.put(new Double(Math.random()),
                                    new int[(int)(Math.random()*32)]);
                              dataSize++;
                          }
                          else if (!m.isEmpty()) {
                              Iterator iter = m.keySet().iterator();
                              iter.next();
                              iter.remove();
                              dataSize--;
                          }
                      }
                      if (++count % 25000 == 0) {
                          dataSize = 0;
                          for (int i = 0; i < maps_.length; i++) {
                              synchronized (maps_[i]) {
                                  dataSize += maps_[i].size();
                              }
                          }
                          if (print_)
                              System.out.println("======++++size = " + dataSize);
                      }
                  }
              }
          }

          static final Map[] maps_ = new Map[100];

          public static void main(String[] args) {
              for (int i = 0; i < maps_.length; i++)
                  maps_[i] = new HashMap();

              for (int i = 0; i < 20; i++)
                  (new Worker(i == 0)).start();
          }
      }

      ---------- END SOURCE ----------
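      The add/remove decision inside Worker.run() above is what keeps the total
      population of the maps hovering near TARGET_SIZE, which in turn keeps the
      allocation and garbage rates roughly steady. A minimal sketch isolating that
      biasing arithmetic (the class and method names here are illustrative, not
      part of the original report; the random draw is passed in as a parameter so
      the decision is deterministic and testable):

```java
public class AddBiasSketch {
    static final int TARGET_SIZE = 100000;

    // r stands in for Math.random(); same arithmetic as Worker.run() above.
    static boolean shouldAdd(int dataSize, double r) {
        double diff = dataSize - TARGET_SIZE;
        if (diff < 0) {
            // Below target: the shortfall raises the probability of adding.
            return 2 * r + (-diff) / TARGET_SIZE > 1;
        }
        // At or above target: the excess lowers the probability of adding.
        return 2 * r + diff / TARGET_SIZE < 1;
    }

    public static void main(String[] args) {
        // At the target the decision is a fair coin (threshold at r = 0.5).
        System.out.println(shouldAdd(TARGET_SIZE, 0.25)); // true
        System.out.println(shouldAdd(TARGET_SIZE, 0.75)); // false
        // Far above the target the worker always removes.
        System.out.println(shouldAdd(2 * TARGET_SIZE, 0.99)); // false
    }
}
```

      Far below the target the worker adds on almost every draw; far above it
      always removes, so the steady state produces a constant stream of small,
      short-lived Double and int[] objects for the collector to chase.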

      CUSTOMER WORKAROUND :
      By playing with the memory allocation settings, you can reduce how
      often the bug happens. But I have no systematic way to work
      around it, other than not using
      -XX:+UseConcMarkSweepGC
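
      Purely as an illustration of the workaround described above (the
      alternative sizes below are made-up example values, not ones from the
      report), the same program could be launched without the concurrent
      collector, or with resized generations to make the hang less frequent:

```
# Drop the concurrent collector entirely (the reliable workaround):
java -server -Xmn16M -Xms80M -Xmx80M ConcurrentGCProblem

# Or keep CMS but change the generation sizing; per the report this only
# reduces how often the hang occurs (example values, workload-dependent):
java -server -Xmn32M -Xms160M -Xmx160M -XX:+UseConcMarkSweepGC ConcurrentGCProblem
```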
      (Review ID: 178997)
      ======================================================================

            Assignee: jmasa Jon Masamitsu (Inactive)
            Reporter: jleesunw Jon Lee (Inactive)