The customer is seeing regular SIG 11 (SIGSEGV) errors in their application on both Solaris and Linux. The cause isn't clear, and the failure does not reproduce the same way on both platforms: it seems to happen differently depending on which OS the application runs on. The cause seems to be a bug in the GC system, and indeed they also fairly regularly see exits with this message:
Fatal: CMSMarkStack is full
even when running with -XX:CMSMarkStackSize=64M. These may or may not be related problems. One problem seems to exist when they use CMS, but another surfaces when they switch it off, so disabling CMS is not a workaround.
They see two different but related Error IDs:
434F4E43555252454E542D41524B335745455027454E45524154494F4E0E4850500089
- which appears to mean the same as the "CMSMarkStack is full" message, and
4F530E43505002EF
- which appears to suggest a stack corruption?
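The two Error IDs look like hex-encoded source locations rather than opaque codes. The sketch below is only a guess at the encoding, inferred from the IDs themselves: it assumes every byte except the last two is a filename character shifted down by 0x20, and that the last two bytes are a big-endian line number. The function name decode_error_id is made up for illustration and is not part of any JDK tool. The resulting line numbers would only be meaningful against the source of the exact VM build that produced the ID.

```python
def decode_error_id(error_id):
    """Decode a HotSpot-style Error ID into (filename, line).

    Assumption (not confirmed by the report): each filename byte is the
    original character minus 0x20, and the trailing two bytes hold the
    line number in big-endian order.
    """
    raw = bytes.fromhex(error_id)
    name_bytes, line_bytes = raw[:-2], raw[-2:]
    # Undo the assumed 0x20 shift to recover the filename characters.
    filename = "".join(chr(b + 0x20) for b in name_bytes)
    line = int.from_bytes(line_bytes, "big")
    return filename, line


if __name__ == "__main__":
    ids = [
        "434F4E43555252454E542D41524B335745455027454E45524154494F4E0E4850500089",
        "4F530E43505002EF",
    ]
    for error_id in ids:
        filename, line = decode_error_id(error_id)
        print(f"{filename}:{line}")
```

Under that assumption the first ID points into concurrentMarkSweepGeneration.hpp, which would fit the CMSMarkStack message, and the second points into os.cpp, which is less specific than a stack corruption and may simply be where the VM reports an unexpected signal.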
_dl_sysinfo_int80 also appears a *lot* in any backtraces from gdb. I've attached a large amount of debugging data.
They are not using JNI or native threads, and unfortunately the application here is *huge*, so a test case seems unlikely.
Pstack output looks like:
15492: /home/xcolldev/java_1.4.2_05/j2re1.4.2_05/bin/java -server -Djava.security.policy=/sbcimp/dyn/data/RISK/XCOLL/LINUXUATBUILT/...
(No symbols found)
0xf65ebc32: ???? (805d314, 805d2fc, f6477d44, 805d314, 805d2fc, 0) + a4
0xf6222e1d: ???? (805d314, 805d2fc, f6477d44, 805efd8, 805efd8, feffa550) + 64
0xf62111c6: ???? (805d2c8, 0, 0, f6477d44, 805efd8, f6164670) + 20
0xf629d0a1: ???? (805813c, f61602d4, 805f078, 10004, f6380e2a, 0)
0xf616037e: ???? (f64662a0, f65c4e58, feffc704, 805830c, f6164670, 805f3ac) + 2060
0x08049b33: ???? (8, 805844c, feffc770, 0, f65c4e58, f6600020) + 40
0xf64a5748: ???? (8049250, 1a, feffc704, 8048dc0, 805430c, f65f7f50) + 1003908
###@###.### 2004-09-01: removed CMS reference from synopsis.
The crash does not need CMS.
- duplicates
  - JDK-5081418 SIG11 crash in ContiguousSpace::prepare_for_compaction on Linux (Closed)
- relates to
  - JDK-4615723 CMS: deal with CMS marking stack overflow (Closed)
  - JDK-6695968 CMS thread fails w/ 434F4E43555252454E542D41524B335745455027454E45524154494F4E0E4850500089 (Closed)