Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-6262235

NMethodSweeper::sweep takes a long time (mostly in nmethod::cleanup_inline_caches)

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Fixed
    • Icon: P1 P1
    • 6
    • 1.4.2_05
    • hotspot
    • b47
    • sparc
    • solaris_9
    • Not verified

        Java 1.4.2_05 with ParNewGC and CMS is "locking up" on us, in 64-bit mode with a 10gb heap... We are not really locked, just busy, with no feedback or progress (although we do respond to kill -QUIT for threaed dumps).

        It's possible we have a "serial" part of the safepoint process which doesn't scale particularly well. Installing _07 was planned for testing since before _08 was out, although the VM code mentioned in the stacks below looks the same in _08 and 1.5.

        The heap is filling up, and CMS is blocked waiting for the VMThread, which is shown in the pstack below performing a safepoint.

        CMS never gets to run in the time (half an hour?!) that the customer was able to wait.

        ----------------- lwp# 20 / thread# 20 --------------------
         ffffffff7efa81bc lwp_cond_wait (10011afd8, 10011afc0, 0)
         ffffffff7efa0090 _lwp_cond_wait (10011afd8, 10011afc0, 5594c0, ffffffff7e9510a4, 0, 0) + c
         ffffffff7e654054 int Monitor::wait(int,long) (ffffffff7eb70590, 1, 0, 0, 9000, 91a8) + f4
         ffffffff7e7e3038 void ConcurrentMarkSweepThread::synchronize(int) (1, ffffffff7efa54d4, 0, 10011afc0, ffffffff7e7e0448, 0) + 110
         ffffffff7e7e0520 void MarkFromRootsClosure::do_yield_work() (fffffffcb26ffb88, 37, 1b7, 1, 100710ba0, 100710c48) + f8
         ffffffff7e7e06e8 void MarkFromRootsClosure::scanOopsInOop(HeapWord*) (fffffffcb26ffb88, fffffffe88856190, fffffffd4ec00000, 0, 0, fffffffcb26ff650) + 1a8
         ffffffff7e7a3af0 void BitMap::iterate(BitMapClosure*,unsigned long,unsigned long) (4e71580, fffffffcb26ffb88, 0, 3cf80000, 100710ba0, 100710c48) + 80
         ffffffff7e7dc718 void CMSCollector::markFromRootsWork(int) (100710970, 1, ffffffff7e9ac085, 10011afc0, ffffffff7e7dc52c, 0) + 140
         ffffffff7e7dc564 void CMSCollector::markFromRoots(int) (100710970, 1, 0, 10011afc0, ffffffff7e7da134, 0) + 64
         ffffffff7e7da23c void CMSCollector::collect_in_background(int) (100710d68, 0, 0, 10011afc0, ffffffff7e7e2c78, 0) + 20c
         ffffffff7e7e2c78 void ConcurrentMarkSweepThread::run() (100710dc0, 0, b800, bac0, 8c00, 8ea0) + 258
         ffffffff7e6cc4fc _start (100710dc0, 0, 0, 0, 0, 0) + ec
         ffffffff7f3181a8 _lwp_start (0, 0, 0, 0, 0, 0)

        ----------------- lwp# 21 / thread# 21 --------------------
         ffffffff7e5c4d94 void RelocIterator::set_limits(unsigned char*,unsigned char*) (fffffffcafeff858, ffffffff3a73d060, ffffffff3a73d061, 0, 0, 0) + 70
         ffffffff7e5c4ce0 void RelocIterator::initialize(long,CodeBlob*,unsigned char*,unsigned char*) (fffffffcafeff858, 0, ffffffff3a73c750, ffffffff3a73d060, ffffffff3a73d061, 0) + 120
         ffffffff7e5e1100 RelocIterator virtual_call_Relocation::parse_ic(CodeBlob*&,unsigned char*&,unsigned char*&,oopDesc**&,int*) (fffffffcafeff960, fffffffcafeff9b8, fffffffcafeff9c8, fffffffcafeff9c0, 100a31f38, 100a31f98) + b8
         ffffffff7e5e536c CompiledIC::CompiledIC #Nvariant 1(Relocation*) (100a31f30, fffffffcafeffa88, ffffffff3a73c750, ffffffff3a73c8b4, 0, 0) + 34
         ffffffff7e625f84 void nmethod::cleanup_inline_caches() (ffffffff3a73c750, ffffffff3a73eec0, 0, 0, 9000, 91a8) + 1bc
         ffffffff7e691e54 void NMethodSweeper::sweep() (10011ecd0, 0, ffffffffffffffff, ffffffff7eb9b108, ffffffff7eb8b270, 0) + 1b4
         ffffffff7e692104 void SafepointSynchronize::begin() (10011ecd0, ffffffff7eb76224, b000, b088, b000, b388) + 224
         ffffffff7e74a428 void VMThread::loop() (7400, 9400, 9510, 9400, 9420, 8400) + 2c8
         ffffffff7e749c64 void VMThread::run() (10076f4f0, 40, 0, 0, 0, 0) + 94
         ffffffff7e6cc4fc _start (10076f4f0, 0, 0, 0, 0, 0) + ec
         ffffffff7f3181a8 _lwp_start (0, 0, 0, 0, 0, 0)



        Analysed a gcore from a previous lockup, and cleanup_inline_caches was going through a for... loop, so it should complete, but takes a long, long time for some reason.

        We can get Thread Dumps, so maybe the VMThread finishes the "sweep", does the thread dump and then does another sweep, not giving CMS a chance to finish it's mark? Although this sounds unlikely as all calls look like the same NMethodSweeper::sweep call (identical pstack line).

        The last thing SafepointSynchronize::begin() does is to call do_cleanup_tasks, which includes the call to NMethodSweeper::sweep() . So the sweep happens at every safepoint. So it must have comleted fine before, but now takes "forever"...

        sweep contains the for..loop which calls process_nmethod, which calls void nmethod::cleanup_inline_caches() .

        )$ grep cleanup_inline_caches psta*
        pstack1_1.log: ffffffff7e625f84 __1cHnmethodVcleanup_inline_caches6M_v_ (ffffffff3a73c750, ffffffff3a73eec0, 0, 0, 9000, 91a8) + 1bc
        pstack1_2.log: ffffffff7e625f84 __1cHnmethodVcleanup_inline_caches6M_v_ (ffffffff3b3d9210, ffffffff3b407880, 0, 0, 9000, 91a8) + 1bc
        pstack1_3.log: ffffffff7e625f84 __1cHnmethodVcleanup_inline_caches6M_v_ (ffffffff3b298b90, ffffffff3b29cd80, 0, 0, 9000, 91a8) + 1bc
        pstack1_5.log: ffffffff7e625f64 __1cHnmethodVcleanup_inline_caches6M_v_ (ffffffff3a3151d0, ffffffff3a3156c0, 0, 0, 9000, 91a8) + 19c
        pstack1_6.log: ffffffff7e625f84 __1cHnmethodVcleanup_inline_caches6M_v_ (ffffffff3b4c5450, ffffffff3b4cce00, 0, 0, 9000, 91a8) + 1bc


        .. so that shows 5 different NMethods. Sounds like we are iterating as designed, but it's taking so long that 6 sequential pstacks catch the same loop, and the application never appears to progress. Certainly, the CMS would log GC information if it ever got to run again...


        ###@###.### 2005-04-27 15:33:06 GMT

              never Tom Rodriguez
              kevinw Kevin Walls
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Created:
                Updated:
                Resolved:
                Imported:
                Indexed: