DB runs Java applications on both production and non-production machines.
They have experienced application crashes on the non-production machines about twice per week.
The core files revealed that heap references to some objects had become invalid, which is probably a GC issue. Our initial suggestion was to set both -Xms and -Xmx to 512m, and both -XX:PermSize and -XX:MaxPermSize to 64m.
Comparing the JVM options on the production and non-production machines, we found that the production machines do not set -XX:+CMSIncrementalMode. We therefore suggested removing this parameter from the non-production machines, and no further crashes have been reported so far.
Here are the JVM options in the non-production and production environments:
Non-production
JAVA_MEMSET="-server -ms${JAVA_MIN_MEM} -mx${JAVA_MAX_MEM} -XX:NewSize=96m -XX:MaxNewSize=96m -XX:PermSize=32m -XX:MaxPermSize=64m -XX:MaxTenuringThreshold=4 -XX:SurvivorRatio=4 -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=50 -XX:+CMSIncrementalPacing -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=10 -XX:CMSMarkStackSize=32M -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+DisableExplicitGC -Dfile.encoding="UTF-8" -Dweblogic.PeriodLength=120000 -Dweblogic.IdlePeriodsUntilTimeout=4 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -Xloggc:/sprisw1/sit1/actatr01/IntegrationServer/logs/gc.log"
Production
JAVA_MEMSET="-server -ms${JAVA_MIN_MEM} -mx${JAVA_MAX_MEM} -XX:NewSize=256m -XX:MaxNewSize=256m -XX:SurvivorRatio=32 -XX:+UseConcMarkSweepGC -Dfile.encoding="UTF-8" -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dweblogic.PeriodLength=120000 -Dweblogic.IdlePeriodsUntilTimeout=8 -Xloggc:/pprisw/prod/aptphu01/IntegrationServer/logs/gc.log"
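The comparison of the two option sets can be automated. The following is a minimal sketch in Python, not the procedure actually used in this investigation. The helper names parse_jvm_flags and diff_flags are hypothetical, and the two option strings are abbreviated subsets of the settings listed above, chosen only to keep the example short.

```python
import shlex


def parse_jvm_flags(options):
    """Split a JAVA_MEMSET-style string into a {flag name: value} dict."""
    flags = {}
    for token in shlex.split(options):
        if token.startswith(("-XX:+", "-XX:-")):
            # Boolean flag: record '+' (enabled) or '-' (disabled).
            flags[token[5:]] = token[4]
        elif token.startswith("-XX:") and "=" in token:
            # Valued flag such as -XX:NewSize=96m.
            name, value = token[4:].split("=", 1)
            flags[name] = value
        else:
            # Anything else (-server, -verbose:gc, ...) is kept verbatim.
            flags[token] = ""
    return flags


def diff_flags(first, second):
    """Return (only in first, only in second, flags whose values differ)."""
    a, b = parse_jvm_flags(first), parse_jvm_flags(second)
    only_first = sorted(set(a) - set(b))
    only_second = sorted(set(b) - set(a))
    changed = {k: (a[k], b[k]) for k in sorted(set(a) & set(b)) if a[k] != b[k]}
    return only_first, only_second, changed


# Abbreviated subsets of the non-production and production options (assumption:
# trimmed for illustration; the full strings can be passed in the same way).
NON_PROD = ("-server -XX:NewSize=96m -XX:SurvivorRatio=4 -XX:+UseParNewGC "
            "-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing")
PROD = "-server -XX:NewSize=256m -XX:SurvivorRatio=32 -XX:+UseConcMarkSweepGC"

only_non_prod, only_prod, changed = diff_flags(NON_PROD, PROD)
print("only in non-production:", only_non_prod)
print("only in production:", only_prod)
print("different values:", changed)
```

On these subsets, the incremental-mode flags appear only on the non-production side, which is the difference that prompted the suggestion to remove -XX:+CMSIncrementalMode.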
Here is one of the core files:
pkg_core_MOSIM_091008
current thread: t@35
=>[1] __lwp_kill(0x0, 0x6, 0x0, 0xff33c000, 0x0, 0x0), at 0xff320218
[2] raise(0x6, 0x0, 0xc257ec70, 0x0, 0x0, 0x0), at 0xff2d0c80
[3] abort(0x0, 0x1, 0x1, 0x28fc, 0xff0c2658, 0x409468), at 0xff2b6e98
[4] os::abort(0x1, 0xff185611, 0x1, 0x7efefeff, 0x81010100, 0xff00), at 0xff0bd83c
[5] VMError::report_and_die(0xff19bb44, 0xff19bb53, 0xff19bb63, 0xfecd4944, 0xc257f330, 0xc257f078), at 0xff123f28
[6] JVM_handle_solaris_signal(0xfecd4944, 0xfecd4944, 0xff185115, 0x1, 0x0, 0x0), at 0xfeddae3c
[7] __sighndlr(0xb, 0xc257f330, 0xc257f078, 0xfedda3f0, 0x0, 0x0), at 0xff3956c8
---- called from signal handler with signal 11 (SIGSEGV) ------
[8] JVM_ArrayCopy(0x3a7838, 0xc257f480, 0xc257f47c, 0x31, 0xc257f478, 0x4), at 0xfecd4944
[9] 0xf9c43ed4(0xd9877098, 0x31, 0xd94d49e8, 0x4, 0x5, 0x26b2b60c), at 0xf9c43ed4
[10] 0xf9c8ae64(0xf18474f8, 0xd94cc6d0, 0xd94d49e8, 0x9, 0x4, 0xc257f4a8), at 0xf9c8ae64
[11] 0xfa47a1dc(0xd94d2dd8, 0xfb6f4000, 0xd94d3760, 0x3a77a0, 0xd94d3860, 0xf18474f8), at 0xfa47a1dc
[12] 0xf9c35f58(0xf2876050, 0xf18474f8, 0xf18474f8, 0xf18474f8, 0x0, 0xf18474f8), at 0xf9c35f58
[13] 0xfa3b34e4(0xd99dac20, 0xd94cc870, 0x0, 0xf1847504, 0xf18474f8, 0xd94c6ba8), at 0xfa3b34e4
[14] 0xfa38dd68(0xf1809020, 0xfb6f4000, 0x1, 0xf1d64e28, 0x1, 0x0), at 0xfa38dd68
[15] 0xf9c46208(0xd98154f0, 0xb8, 0x8, 0xf9c16230, 0x0, 0xc257f6a8), at 0xf9c46208
[16] 0xf9c05850(0xc257f820, 0xb8, 0x0, 0xf9c163b0, 0xc, 0xc257f728), at 0xf9c05850
[17] 0xf9c05850(0xc257f8a8, 0xb7, 0x0, 0xf9c15fb0, 0xc, 0xc257f7b8), at 0xf9c05850
[18] 0xf9c05904(0xc257f928, 0x13, 0xf381e81c, 0xf9c16230, 0x8, 0xc257f848), at 0xf9c05904
[19] 0xf9c7a7f8(0xd9800730, 0xd94c4c50, 0xfffffffc, 0xf9ec4f36, 0x4, 0xc257f8e8), at 0xf9c7a7f8
[20] 0xfa229630(0xf18474f8, 0x0, 0xd94c4c00, 0x13, 0x1, 0xc257f950), at 0xfa229630
[21] 0xfa3bc6b8(0x1, 0xf1838d20, 0xd94c4ba8, 0xd94c4bb0, 0x1, 0xd9800730), at 0xfa3bc6b8
[22] 0xfa44fbec(0xf18474f8, 0xd7868628, 0xd9809fd0, 0x63, 0xd9809f70, 0xd9800020), at 0xfa44fbec
[23] 0xf9d70cb8(0xc257fb9c, 0x0, 0xf3870a0c, 0xf9c18120, 0x10, 0xc257fab0), at 0xf9d70cb8
[24] 0xf9c0020c(0xc257fc28, 0xc257fe90, 0xa, 0xf1dcded0, 0x4, 0xc257fb40), at 0xf9c0020c
[25] JavaCalls::call_helper(0xc257fe88, 0xc257fcf0, 0xc257fda8, 0x3a77a0, 0x3a77a0, 0xc257fd00), at 0xfed5ff18
[26] JavaCalls::call_virtual(0xff1a0000, 0x3a7d50, 0xc257fd9c, 0xc257fd98, 0xc257fda8, 0x3a77a0), at 0xfee4e0e0
[27] JavaCalls::call_virtual(0xc257fe88, 0xc257fe84, 0xc257fe7c, 0xc257fe74, 0xc257fe6c, 0x3a77a0), at 0xfee6146c
[28] thread_entry(0x3a77a0, 0x3a77a0, 0x180c10, 0x3a7d50, 0x334144, 0xfee6be20), at 0xfee72770
[29] JavaThread::run(0x3a77a0, 0x23, 0x40, 0x0, 0x40, 0x0), at 0xfee6be48
[30] java_start(0x3a77a0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xff0bcde0
0xf9c43eac: cmp %i0, 0
0xf9c43eb0: st %sp, [%l3 + 136]
0xf9c43eb4: move %icc,%i0, %o2
0xf9c43eb8: add %sp, 104, %o1
0xf9c43ebc: mov 4, %l0
0xf9c43ec0: st %l0, [%l3 + 196]
0xf9c43ec4: st %i4, [%sp + 92]
0xf9c43ec8: mov %i3, %o5
0xf9c43ecc: mov %i1, %o3
0xf9c43ed0: add %l3, 152, %o0
0xf9c43ed4: call JVM_ArrayCopy ! 0xfecd4858
0xf9c43ed8: mov %g2, %l7
(dbx) x $l3 + 152
0x003a7838: 0xff1e2bf0
(dbx) x 0xff1e2bf0
0xff1e2bf0: jni_NativeInterface : 0x00000000
(dbx) x $sp + 104
0xc257f480: 0xf18168f0
(dbx) x 0xf18168f0
0xf18168f0: 0x00000001
(dbx) x $i0
0xd9877098: 0x00000000
(dbx) x $i1
0x00000031: dbx: core file read error: address 0x31 not in data space
(dbx) x $i2
0xd94d49e8: 0x00000001
(dbx) x $i3
0x00000004: dbx: core file read error: address 0x4 not in data space
The src argument is not a valid oop. The address 0xd9877098 lies within the heap, but the object at that address is invalid when inspected.
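To make the register values easier to read, here is a small Python sketch that labels the first five arguments of frame [9]. It assumes, based on the HotSpot declaration JVM_ArrayCopy(env, ignored, src, src_pos, dst, dst_pos, length) and on the disassembly above (%i1 moved to %o3, %i3 moved to %o5, %i4 stored on the stack), that these arguments map to src, src_pos, dst, dst_pos and length. The names FRAME9_ARGS, ARG_NAMES and decode_arraycopy_args are hypothetical.

```python
# First five arguments of frame [9], copied from the stack trace above.
FRAME9_ARGS = [0xd9877098, 0x31, 0xd94d49e8, 0x4, 0x5]

# Assumed mapping to the array copy parameters (see the note above).
ARG_NAMES = ["src", "src_pos", "dst", "dst_pos", "length"]


def decode_arraycopy_args(args):
    """Label each raw value: object references stay hex, ints become decimal."""
    decoded = {}
    for name, value in zip(ARG_NAMES, args):
        if name in ("src", "dst"):
            decoded[name] = hex(value)   # heap address of the array object
        else:
            decoded[name] = value        # plain integer index or length
    return decoded


decoded = decode_arraycopy_args(FRAME9_ARGS)
for name, value in decoded.items():
    print(name, "=", value)
```

Under that assumption, 0x31 and 0x4 are integer positions rather than addresses, which would explain why dbx reports them as not in data space. The only suspicious value is the src reference.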