Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-6484560

(coll) High cpu usage in for loop inside HashMap.rehash()

    XMLWordPrintable

Details

    • Bug
    • Status: Closed
    • P2
    • Resolution: Not an Issue
    • 1.3.1_18
    • None
    • core-libs

    Description

      In customer's Portal there are 9 production web instances (3 web instances on each of 3 - 12 CPU - Sun Fire V1280 1280 servers) running on Sun J2SE 1.3.1 (_18), where it is observed that after running for some time, there are 0, 1 or more (up to 5 have at most have been observed) 100% usr looping threads, in any given web instance as the day progresses, were in each looping thread would consume 100% of 1 of 12 CPUs e.g. 5 looping threads would consume / waste 5 CPUs or 42% (5/12) of a servers total CPU throughput.

      Furthermore, once a thread starts looping due to this issue, it does not stop looping until the related web instance is recycled.

      Production environment consists of:
      Sun ONE Portal Server 6.0
      Sun Directory Server Access Management 5.1
      Sun ONE Web Server 6.0 SP7
      Sun Directory Server 5.1 SP3
      Sun J2SE 1.3.1_18
      Solaris 8 SPARC (Generic_117350-38)

      They already use LD_LIBRARY_PATH=/usr/lib/lwp:$LD_LIBRARY_PATH
      -XX:+OverrideDefaultLibthread

      SunPS (Nikolaos) identified from prstat and javadump that the treads using cpu where all running in line 292 of HashMap.rehash()

      at java.util.HashMap.rehash(HashMap.java:292)
      at java.util.HashMap.put(HashMap.java:344)
      at com.sun.portal.providers.jsp.Request.setAttribute(Request.java:78)
      ...

      e.g: a prstat -m -L 17744 clearly shows LWP #152 to be a looping thread:

      PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID 17744 Onaportl 100 0.0 0.0 0.0 0.0 0.1 0.0 0.0 15 33 71 10 ns-httpd/152

      In addition, pstack 17744 output for LWP #152 yields:

      ----------------- lwp# 152 / thread# 152 --------------------
      f9d774f0 ???????? (17, 17, ffffffff, 0, 17479b8, 72)
      f9d3279c ???????? (0, f18379c8, cf408950, cf4088e8, 9, ea3020)
      f9f9fbfc ???????? (cf15b078, f20d7c48, cf408950, 0, 0, 77fa5948)
      f9c0795c ???????? (77fa59f0, f20d7c48, 77ecbca8, 346520, 21b400, 109a0)
      0033abe4 ???????? (6ee7eaac, 6ee7eb20, 6ee7eb24, 346628, c, 6ee7e880)
      0033aa7c ???????? (6ee7eb2c, 6ee7eba8, 6ee7ebac, 346520, c, 6ee7ea48)
      0033aa7c ???????? (6ee7ebb4, f19cb1f8, fe0e2638, 346520, c, 6ee7eac8)
      0033abe4 ???????? (6ee7ec4c, 81b9ca58, 77ecbca8, 346894, c, 6ee7eb50)
      0033aa7c ???????? (6ee7ecdc, f184c158, 10d, 346520, c, 6ee7ebd8)
      f9cc4698 ???????? (7683b5a8, 824a9bd0, 77eb1d78, 77eba020, 18, 6ee7ec88)
      fa012f90 ???????? (cf163c00, 824a9bd0, cf15ca90, cf15cb30, 0, 0)
      f9c0ad6c ???????? (7683b5a8, 0, 6ee7edb0, 6ee7edcc, 21b400, 109a0)
      0033aaa0 ???????? (6ee7ee64, f1db7620, 81b9caa8, 3466fc, 10, 6ee7ed60)
      0033aaa0 ???????? (6ee7eef4, f1e0b448, 81b9caa8, 3466fc, c, 6ee7edf8)
      0033aaa0 ???????? (6ee7ef9c, f1a1c3c0, 10d, 3466fc, c, 6ee7ee88)
      0033ac08 ???????? (6ee7f054, f1c50458, 6ee7f0d8, 346894, c, 6ee7ef20)
      0033ac08 ???????? (6ee7f0e4, f1bdac38, 0, 346894, 14, 6ee7efe8)
      0033ac08 ???????? (6ee7f1d4, 104, 6ee7f1ac, 346894, 10, 6ee7f070)
      0033aa7c ???????? (6ee7f2a4, 6ee7f304, 6ee7f308, 3466fc, c, 6ee7f158)
      0033aa7c ???????? (6ee7f310, ea3020, fe0e2638, 346520, c, 6ee7f240)
      f9c07a4c ???????? (81e2c2f0, 77eb1be8, 77eb1cc8, 2, c, 6ee7f2c8)
      f9f21cf8 ???????? (0, cf408d28, 81e2c2f0, 1, ea3020, 81c21280)
      f9f1608c ???????? (81c21280, 0, 6ee7f478, 6ee7f49c, 21b400, 109a0)
      0033aaa0 ???????? (6ee7f54c, 6ee7f5c4, 6ee7f5c8, 346520, 20, 6ee7f418)
      0033aaa0 ???????? (6ee7f5dc, 1, fe0f1238, 346520, 14, 6ee7f4e0)
      fe14ce08 __1cMStubRoutinesG_code1_ (6ee7f668, 6ee7f868, a, f192aca0, 18, 6ee7f568) + 3fc
      fdccd6b0 __1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (6ee7f860, fe0e2638, 6ee7f794, ea3020, 33c8cc, 6ee7f868) +
      388
      fdd7658c __1cJJavaCallsEcall6FpnJJavaValue_nMmethodHandle_pnRJavaCallArguments_
      pnGThread__v_ (6ee7f6b4, 6ee7f774, 6ee7f794, fe0e2638, 6ee7f860, ea3020) + a8
      fddd7ca4 __1cKjni_invoke6FpnHJNIEnv__pnJJavaValue_pnI_jobject_nLJNICallType_pnK_jmethodID_pnSJNI_ArgumentPusher_pnGThread__v_ (6, ea3020, 61afb8, 1, 61a668, 6ee7f844) + 340
      fddf50e8 jni_CallIntMethodV (fe0e2638, ea3020, 61afb8, 6ee7f928, 61a668, ea30ac) + fc
      fe61d480 ???????? (ea30ac, 61afb8, 61a668, b8fff8, b8fffc, b90000)
      fe61a03c __0fUNSServletSessionImplHservicev (6ee7fa88, 0, b90000, 99f7b0, fe61bc94, 6ee7f948) + 2a4
      fe61e8ec ???????? (6ee7fa88, 0, 6ee7fa0c, 0, 65, eae689a0)
      fe5a7e94 __0fKNSJavaUtilJrunGlobalPFPv_vPviT (fe61e8c8, 6ee7fa88, 0, fe631c00, b6110ac0, fe631d34) + 84
      fe61fe98 NSServletService (230f80, 99f768, 99f7b0, fe632ba4, 228678, 0) + 420
      ff1b9194 __0FNfunc_exec_strP6KFuncStructP6GpblockP6HSessionP6HRequest (6bbe0, 230f80, 99f768, 99f7b0, 0, 0) + 1f0
      ff1ba384 INTobject_execute (58db38, 99f768, 99f7b0, 58db38, 99f6b8, ff287800) + 56c
      ff1be004 INTservact_service (99f768, 99f7b0, 0, 0, 0, ff287800) + 444
      ff1be558 INTservact_handle_processed (99f768, 99f7b0, 2, 22e1e0, fffffffc, 0) + 140
      ff1f3048 __0fLHttpRequestUUnacceleratedRespondPCcPc (99f6b8, 94c068, ff2976a8, 99f7b0, 99f768, ff297400) + 4e8
      ff1f2648 __0fLHttpRequestNHandleRequestP6Gnetbuf (99f6b8, 94c080, 94c068, 2000, 94a050, ff297400) + 590
      ff1f0df8 __0fNDaemonSessionDrunv (99ef68, ff296c00, ff296c00, ff1f20b8, 1, 0) + 40c
      ff2d407c ThreadMain (99ef68, ff2d4054, 17828, 0, 0, 0) + 28
      fed67698 _pt_root (99ff88, fed81074, 1, 0, 0, 0) + a4
      feb04c3c _lwp_start (0, 0, 0, 0, 0, 0)

      And above shows that operation on a Servlet's (NSServletService) session (0fUNSS
      ervletSessionImplHservicev) is at work here, and moreover the JNI calls are indi
      cative of JIT compiled C code that the JVM is hooking into.

      Furthermore truss output w.r.t. LWP #152 yields repeated execution of the follow
      ing (form):

      17744/152: Received signal #16, SIGUSR1 [caught]
      17744/152: siginfo: SIGUSR1 pid=17744 uid=16833 code=-1
      17744/152: sigprocmask(SIG_SETMASK, 0x6EE7E35C, 0x00000000) = 0
      17744/152: set = 0x7FBFFEFF 0x00001FF6 0 0
      17744/152: lwp_unpark(152, 0) = 0
      17744/152: lwp_park(0x00000000, 14) = 0
      17744/152: sigprocmask(SIG_SETMASK, 0xFEB17C18, 0x6EE7E110) = 0
      17744/152: set = 0xFFBFFEFF 0x00001FFF 0 0
      17744/152: oset = 0x7FBFFEFF 0x00001FF6 0 0
      17744/152: lwp_unpark(152, 1) = 0
      17744/152: setcontext(0x6EE7E120)
      17744/152: Incurred fault #1, FLTILL %pc = 0x00F4BDC0
      17744/152: siginfo: SIGILL ILL_ILLTRP addr=0x00F4BDC0
      17744/152: Received signal #4, SIGILL [caught]
      17744/152: siginfo: SIGILL ILL_ILLTRP addr=0x00F4BDC0
      17744/152: sigprocmask(SIG_SETMASK, 0x6EE7E35C, 0x00000000) = 0
      17744/152: set = 0x7FBFFEFF 0x00001FF6 0 0
      17744/152: sigprocmask(SIG_SETMASK, 0xFEB17C18, 0x6EE7E110) = 0
      17744/152: set = 0xFFBFFEFF 0x00001FFF 0 0
      17744/152: oset = 0x7FBFFEFF 0x00001FF6 0 0
      17744/152: lwp_unpark(152, 1) = 0
      17744/152: setcontext(0x6EE7E120)
      17744/152: lwp_unpark(14, 1) = 0
      17744/152: lwp_park(0x00000000, 0) = 0
      17744/152: lwp_park(0x00000000, 0) = 0

      And relevant Java thread dump (kill -3 17744) extract shows a Runnable thread executing in the address range [0x6ee7e000..0x6ee7f668] - which moreover directly correlates with pstack and truss output:

      [29/Sep/2006:21:46:07] config (17744): "Thread-176498" [29/Sep/2006:21:46:07] config (17744): prio=5 tid=0xea3020 [29/Sep/2006:21:46:07] config (17744): nid=0x98 [29/Sep/2006:21:46:07] config (17744): runnable [29/Sep/2006:21:46:07] config (17744): [0x6ee7e000..0x6ee7f668]
      at java.util.HashMap.rehash(HashMap.java:292)
      at java.util.HashMap.put(HashMap.java:344)
      at com.sun.portal.providers.jsp.Request.setAttribute(Request.java:78)
      at _jsps._usr._local._sunone._portal._SUNWps._web_apps._https_naportal2._portal._dwportal_en._HomeContainer._html._hometoptable_jsp._jspService(_hometoptable_jsp.java:882)
      at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:119)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
      at com.sun.portal.providers.jsp.JspServletWrapper.service(JspServletWrapper.java:182)
      at com.sun.portal.providers.jsp.JSPProvider.processJspFile(JSPProvider.java:692)
      at com.sun.portal.providers.jsp.JSPProvider.processJspFile(JSPProvider.java:672)
      at com.sun.portal.providers.jsp.JSPProvider.getContent(JSPProvider.java:471)
      at com.gm.dwportal.dealerworld.common.DWJSPTableContainerProvider.getContent(DWJSPTableContainerProvider.java:109)
      at com.gm.dwportal.dealerworld.HomeContainerProvider.getContent(HomeContainerProvider.java:136)
      at com.sun.portal.desktop.context.PSContainerProviderContext.getContent(PSContainerProviderContext.java:367)
      at com.sun.portal.desktop.context.PSDesktopContext.getContent(PSDesktopContext.java:954)
      at com.sun.portal.desktop.DesktopServlet.doGetPost(DesktopServlet.java:493)
      at com.sun.portal.desktop.DesktopServlet.service(DesktopServlet.java:303)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
      at com.iplanet.server.http.servlet.NSServletRunner.invokeServletService(NSServletRunner.java:919)
      at com.iplanet.server.http.servlet.WebApplication.service(WebApplication.java:1058)
      at com.iplanet.server.http.servlet.NSServletRunner.ServiceWebApp

      And clearly identifies the looping to be occuring on the line:
      java.util.HashMap.rehash(HashMap.java:292)


      Adding in .hotspot_compiler
         exclude java.util.HashMap rehash
      they get even more threads in the loop of the method rehash() with 100% cpu usage, and threads are still running inside the loop of rehash method e.g:

      for (int i = oldCapacity ; i-- > 0 ; ) { // 291
      for (Entry old = oldMap[i] ; old != null ; ) { // 292 [2 threads]
        Entry e = old; // 293 [7 threads]
        old = old.next; // 294 [3 threads]
        int index = (e.hash & 0x7FFFFFFF) % newCapacity; // 296 [13 threads]
        e.next = newMap[index]; // 297 [14 threads]
        newMap[index] = e; // 298 [5 threads]
      }
      }

      Has been found also threads with the following stack

      "Thread-98" waiting on monitor
      at java.lang.Object.wait(Native Method)
      at java.lang.Object.wait(Object.java:384)
      at com.sun.common.util.LockWithMemory.waitFor(LockWithMemory.java:32)
      at com.sun.common.util.LockWithMemory.waitFor(LockWithMemory.java:23)
      at com.sun.portal.desktop.context.ReusableProviderCaller.run(ReusableProviderCaller.java:121)

      In other stacks on sunsolve after ReusableProviderCaller the jasper JSPProvider is called to handle a jsp page
      The entire portal is made of JSP pages and JSP Providers.
      ReusableProviderCaller is a core Portal class that calls getContent() on all Providers utilizing different threads.

      No checks of gc activity can be executed on production server

      Attachments

        Activity

          People

            poonam Poonam Bajaj Parhar
            cmassi Claudio Massi (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:
              Imported:
              Indexed: