JDK-6769931

Heap corruption when CMSIncrementalMode is used on J2SDK1.4.2_17

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: P2
    • Fix Version/s: 1.4.2_21
    • Affects Version/s: 1.4.2_17
    • Component/s: hotspot
    • Labels: None
    • Subcomponent: gc
    • CPU: generic
    • OS: solaris_10

    Description

      DB runs Java applications on both production and non-production machines.

      The applications have crashed on the non-production machines about twice per week.

      The core files show that heap references to some objects had become invalid, which is probably a GC issue. Our initial suggestion was to set both -Xms and -Xmx to 512m, and both PermSize and MaxPermSize to 64M.

      Comparing the JVM options on the production and non-production machines, we found that the production machines do not set -XX:+CMSIncrementalMode. We therefore suggested removing this option on the non-production machines, and no crashes have been reported since.

      Here are the JVM options in non-production and production environment:

      Non-production

      JAVA_MEMSET="-server -ms${JAVA_MIN_MEM} -mx${JAVA_MAX_MEM} -XX:NewSize=96m -XX:MaxNewSize=96m -XX:PermSize=32m -XX:MaxPermSize=64m -XX:MaxTenuringThreshold=4 -XX:SurvivorRatio=4 -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=50 -XX:+CMSIncrementalPacing -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=10 -XX:CMSMarkStackSize=32M -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+DisableExplicitGC -Dfile.encoding="UTF-8" -Dweblogic.PeriodLength=120000 -Dweblogic.IdlePeriodsUntilTimeout=4 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -Xloggc:/sprisw1/sit1/actatr01/IntegrationServer/logs/gc.log"


      Production

      JAVA_MEMSET="-server -ms${JAVA_MIN_MEM} -mx${JAVA_MAX_MEM} -XX:NewSize=256m -XX:MaxNewSize=256m -XX:SurvivorRatio=32 -XX:+UseConcMarkSweepGC -Dfile.encoding="UTF-8" -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dweblogic.PeriodLength=120000 -Dweblogic.IdlePeriodsUntilTimeout=8 -Xloggc:/pprisw/prod/aptphu01/IntegrationServer/logs/gc.log"
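
      The comparison of the two option sets can also be done mechanically. The following is a minimal sketch, not part of the original analysis: it splits the two JAVA_MEMSET strings above into options and reports which options appear on only one side and which differ in value. The helper names (option_name, parse, diff_options) are illustrative assumptions.

```python
import re
import shlex

# Option strings copied from the two JAVA_MEMSET definitions in this report.
NON_PROD = (
    '-server -ms${JAVA_MIN_MEM} -mx${JAVA_MAX_MEM} -XX:NewSize=96m '
    '-XX:MaxNewSize=96m -XX:PermSize=32m -XX:MaxPermSize=64m '
    '-XX:MaxTenuringThreshold=4 -XX:SurvivorRatio=4 -XX:+UseParNewGC '
    '-XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC '
    '-XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=50 '
    '-XX:+CMSIncrementalPacing -XX:CMSIncrementalDutyCycleMin=0 '
    '-XX:CMSIncrementalDutyCycle=10 -XX:CMSMarkStackSize=32M '
    '-XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled '
    '-XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 '
    '-XX:+DisableExplicitGC -Dfile.encoding="UTF-8" '
    '-Dweblogic.PeriodLength=120000 -Dweblogic.IdlePeriodsUntilTimeout=4 '
    '-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps '
    '-XX:+PrintTenuringDistribution '
    '-Xloggc:/sprisw1/sit1/actatr01/IntegrationServer/logs/gc.log'
)
PROD = (
    '-server -ms${JAVA_MIN_MEM} -mx${JAVA_MAX_MEM} -XX:NewSize=256m '
    '-XX:MaxNewSize=256m -XX:SurvivorRatio=32 -XX:+UseConcMarkSweepGC '
    '-Dfile.encoding="UTF-8" -verbose:gc -XX:+PrintGCDetails '
    '-XX:+PrintGCTimeStamps -Dweblogic.PeriodLength=120000 '
    '-Dweblogic.IdlePeriodsUntilTimeout=8 '
    '-Xloggc:/pprisw/prod/aptphu01/IntegrationServer/logs/gc.log'
)


def option_name(option):
    """Reduce an option to its name so that differing values still line up."""
    if option.startswith("-XX:"):
        # Drop the +/- of boolean flags and the =value of valued flags.
        body = option[len("-XX:"):].lstrip("+-")
        return "-XX:" + body.split("=", 1)[0]
    # For -Dkey=value, -Xloggc:path and similar, keep the part before = or :.
    return re.split(r"[=:]", option, maxsplit=1)[0]


def parse(options):
    """Map option name -> full option text, splitting the way a shell would."""
    return {option_name(o): o for o in shlex.split(options)}


def diff_options(left, right):
    """Return (names only in left, names only in right, changed values)."""
    a, b = parse(left), parse(right)
    only_left = sorted(a.keys() - b.keys())
    only_right = sorted(b.keys() - a.keys())
    changed = {k: (a[k], b[k])
               for k in sorted(a.keys() & b.keys()) if a[k] != b[k]}
    return only_left, only_right, changed


only_non_prod, only_prod, changed = diff_options(NON_PROD, PROD)
print("Only in non-production:")
for name in only_non_prod:
    print("  " + name)
print("Only in production:")
for name in only_prod:
    print("  " + name)
print("Different values:")
for name, (non_prod_value, prod_value) in changed.items():
    print("  {}: {} vs {}".format(name, non_prod_value, prod_value))
```

      Besides -XX:+CMSIncrementalMode, the listing also shows the other CMS tuning options that are set only on the non-production machines, so it can serve as a checklist if the crash recurs after that option is removed.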


      Here is the stack trace from one of the core files:

      pkg_core_MOSIM_091008

      current thread: t@35
      =>[1] __lwp_kill(0x0, 0x6, 0x0, 0xff33c000, 0x0, 0x0), at 0xff320218
      [2] raise(0x6, 0x0, 0xc257ec70, 0x0, 0x0, 0x0), at 0xff2d0c80
      [3] abort(0x0, 0x1, 0x1, 0x28fc, 0xff0c2658, 0x409468), at 0xff2b6e98
      [4] os::abort(0x1, 0xff185611, 0x1, 0x7efefeff, 0x81010100, 0xff00), at 0xff0bd83c
      [5] VMError::report_and_die(0xff19bb44, 0xff19bb53, 0xff19bb63, 0xfecd4944, 0xc257f330, 0xc257f078), at 0xff123f28
      [6] JVM_handle_solaris_signal(0xfecd4944, 0xfecd4944, 0xff185115, 0x1, 0x0, 0x0), at 0xfeddae3c
      [7] __sighndlr(0xb, 0xc257f330, 0xc257f078, 0xfedda3f0, 0x0, 0x0), at 0xff3956c8
      ---- called from signal handler with signal 11 (SIGSEGV) ------
      [8] JVM_ArrayCopy(0x3a7838, 0xc257f480, 0xc257f47c, 0x31, 0xc257f478, 0x4), at 0xfecd4944
      [9] 0xf9c43ed4(0xd9877098, 0x31, 0xd94d49e8, 0x4, 0x5, 0x26b2b60c), at 0xf9c43ed4
      [10] 0xf9c8ae64(0xf18474f8, 0xd94cc6d0, 0xd94d49e8, 0x9, 0x4, 0xc257f4a8), at 0xf9c8ae64
      [11] 0xfa47a1dc(0xd94d2dd8, 0xfb6f4000, 0xd94d3760, 0x3a77a0, 0xd94d3860, 0xf18474f8), at 0xfa47a1dc
      [12] 0xf9c35f58(0xf2876050, 0xf18474f8, 0xf18474f8, 0xf18474f8, 0x0, 0xf18474f8), at 0xf9c35f58
      [13] 0xfa3b34e4(0xd99dac20, 0xd94cc870, 0x0, 0xf1847504, 0xf18474f8, 0xd94c6ba8), at 0xfa3b34e4
      [14] 0xfa38dd68(0xf1809020, 0xfb6f4000, 0x1, 0xf1d64e28, 0x1, 0x0), at 0xfa38dd68
      [15] 0xf9c46208(0xd98154f0, 0xb8, 0x8, 0xf9c16230, 0x0, 0xc257f6a8), at 0xf9c46208
      [16] 0xf9c05850(0xc257f820, 0xb8, 0x0, 0xf9c163b0, 0xc, 0xc257f728), at 0xf9c05850
      [17] 0xf9c05850(0xc257f8a8, 0xb7, 0x0, 0xf9c15fb0, 0xc, 0xc257f7b8), at 0xf9c05850
      [18] 0xf9c05904(0xc257f928, 0x13, 0xf381e81c, 0xf9c16230, 0x8, 0xc257f848), at 0xf9c05904
      [19] 0xf9c7a7f8(0xd9800730, 0xd94c4c50, 0xfffffffc, 0xf9ec4f36, 0x4, 0xc257f8e8), at 0xf9c7a7f8
      [20] 0xfa229630(0xf18474f8, 0x0, 0xd94c4c00, 0x13, 0x1, 0xc257f950), at 0xfa229630
      [21] 0xfa3bc6b8(0x1, 0xf1838d20, 0xd94c4ba8, 0xd94c4bb0, 0x1, 0xd9800730), at 0xfa3bc6b8
      [22] 0xfa44fbec(0xf18474f8, 0xd7868628, 0xd9809fd0, 0x63, 0xd9809f70, 0xd9800020), at 0xfa44fbec
      [23] 0xf9d70cb8(0xc257fb9c, 0x0, 0xf3870a0c, 0xf9c18120, 0x10, 0xc257fab0), at 0xf9d70cb8
      [24] 0xf9c0020c(0xc257fc28, 0xc257fe90, 0xa, 0xf1dcded0, 0x4, 0xc257fb40), at 0xf9c0020c
      [25] JavaCalls::call_helper(0xc257fe88, 0xc257fcf0, 0xc257fda8, 0x3a77a0, 0x3a77a0, 0xc257fd00), at 0xfed5ff18
      [26] JavaCalls::call_virtual(0xff1a0000, 0x3a7d50, 0xc257fd9c, 0xc257fd98, 0xc257fda8, 0x3a77a0), at 0xfee4e0e0
      [27] JavaCalls::call_virtual(0xc257fe88, 0xc257fe84, 0xc257fe7c, 0xc257fe74, 0xc257fe6c, 0x3a77a0), at 0xfee6146c
      [28] thread_entry(0x3a77a0, 0x3a77a0, 0x180c10, 0x3a7d50, 0x334144, 0xfee6be20), at 0xfee72770
      [29] JavaThread::run(0x3a77a0, 0x23, 0x40, 0x0, 0x40, 0x0), at 0xfee6be48
      [30] java_start(0x3a77a0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xff0bcde0

      0xf9c43eac: cmp %i0, 0
      0xf9c43eb0: st %sp, [%l3 + 136]
      0xf9c43eb4: move %icc,%i0, %o2
      0xf9c43eb8: add %sp, 104, %o1
      0xf9c43ebc: mov 4, %l0
      0xf9c43ec0: st %l0, [%l3 + 196]
      0xf9c43ec4: st %i4, [%sp + 92]
      0xf9c43ec8: mov %i3, %o5
      0xf9c43ecc: mov %i1, %o3
      0xf9c43ed0: add %l3, 152, %o0
      0xf9c43ed4: call JVM_ArrayCopy ! 0xfecd4858
      0xf9c43ed8: mov %g2, %l7
      (dbx) x $l3 + 152
      0x003a7838: 0xff1e2bf0
      (dbx) x 0xff1e2bf0
      0xff1e2bf0: jni_NativeInterface : 0x00000000
      (dbx) x $sp + 104
      0xc257f480: 0xf18168f0
      (dbx) x 0xf18168f0
      0xf18168f0: 0x00000001
      (dbx) x $i0
      0xd9877098: 0x00000000
      (dbx) x $i1
      0x00000031: dbx: core file read error: address 0x31 not in data space
      (dbx) x $i2
      0xd94d49e8: 0x00000001
      (dbx) x $i3
      0x00000004: dbx: core file read error: address 0x4 not in data space

      The src argument is not an oop. The address 0xd9877098 can be found in the heap, but its contents are invalid when inspected.

          People

            kevinw Kevin Walls
            helai Herrick Lai (Inactive)