-
Bug
-
Resolution: Duplicate
-
P4
-
None
-
1.1, 1.3.0, 1.3.1
-
sparc
-
solaris_7, solaris_8
Name: yyT116575 Date: 11/07/2000
java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0)
Java HotSpot(TM) Client VM (build 1.3.0, mixed mode)
In a scheduling system for multiple parallel system jobs, the following problem
occurs erratically:
signal fault in critical section
signal number: 11, signal code: 1, fault address:
0xfee0be20, pc: 0xff36c6e8, sp: 0xf3f80b60
libthread panic: fault in libthread critical section (PID: 6700 LWP 1)
stacktrace:
ff36c6cc
ff36c518
fe5db94c
fe604534
fe562170
ff044ec8
ff047ec4
6fce8
6ceb0
fe7a4994
fe549d3c
fe5499e8
fe551aec
fe5522ac
ff043c98
ff03f56c
6fce8
6ceb0
6ceb0
6ceb0
fe7a4994
fe549d3c
fe5493d4
fe549444
fe57a2a8
fe651a10
fe5ed2e8
ff37bd04
fe5ed2c8
This event cannot be reproduced on demand: it occurs once during a runtime of
several days in which several thousand jobs are started successfully. All
required and recommended patches are installed on the machine. No exceptions are
thrown and the VM continues to run.
The Java code causing this event contains a simple call to Runtime.exec, called
in a separate Thread:
private class JobThread extends Thread {
    private Job job;
    private String exec;
    private File dir;

    JobThread(Job job, String exec, File dir) {
        this.job = job;
        this.exec = exec;
        this.dir = dir;
    }

    public void run() {
        this.job.startTime = new Date();
        try {
            Process process = Runtime.getRuntime().exec(exec, null, dir);
            job.setProcess(process);
            process.waitFor();
            job.collectProcess();
        }
        catch(IOException ioe) {
            System.err.println("IO Exception on exec");
            ioe.printStackTrace();
        }
        catch(InterruptedException ine) {
            System.err.println("Interrupted Exception on exec");
            ine.printStackTrace();
        }
    }
}
The panic occurs when the start() method of the above thread is called.
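For anyone trying to exercise the same code path, a minimal self-contained harness along the lines of the code above might look like the following. It is only a sketch: the Job class is not shown in this report and is left out, the names ExecStress and runJobs and the shell command are hypothetical, and it is written against a current Java release rather than 1.3.0. Given how rarely the panic occurs, there is no reason to expect a short run to trigger it.

```java
import java.io.File;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stress harness; not part of the original report.
class ExecStress {

    /**
     * Starts `count` threads, each of which execs `command` in `dir` and
     * waits for it to finish. Returns how many children exited with status 0.
     */
    static int runJobs(final String[] command, final File dir, int count)
            throws InterruptedException {
        final AtomicInteger succeeded = new AtomicInteger();
        Thread[] threads = new Thread[count];
        for (int i = 0; i < count; i++) {
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    try {
                        Process process = Runtime.getRuntime().exec(command, null, dir);
                        process.getOutputStream().close(); // child sees EOF on stdin
                        if (process.waitFor() == 0) {
                            succeeded.incrementAndGet();
                        }
                    } catch (IOException ioe) {
                        ioe.printStackTrace();
                    } catch (InterruptedException ine) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            threads[i].start();
        }
        // Wait for every job thread before reporting.
        for (Thread t : threads) {
            t.join();
        }
        return succeeded.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed command: a shell that exits at once and prints nothing,
        // so the unread stdout/stderr pipes cannot fill up.
        String[] command = {"/bin/sh", "-c", "exit 0"};
        int ok = runJobs(command, new File("."), 50);
        System.out.println(ok + " of 50 jobs exited with status 0");
    }
}
```

The harness passes the command as an array because the String overloads of Runtime.exec are deprecated in recent releases; the original code uses the String form.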
(Review ID: 111640)
======================================================================
Name: yyT116575 Date: 03/09/2001
java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0)
Java HotSpot(TM) Client VM (build 1.3.0, mixed mode)
This is the error log from Brazil, which has been running on the machine pn21.eng. We've been flooding it with lots of HTTP requests from other machines connected to it via fiber.
However, a more interesting bug recently showed up. I started testing Brazil running CGI applications. This one is a simple one, a Perl script that just dumps the environment variables. You can see the page at:
http://pn21.eng/cgi/cgi_test.cgi
And the source is included.
Below the Java exceptions you can see there is a libthread panic. This did not show up until I added the CGI test today.
Thought everyone should know.
Server started on 80
Setting server to run as user: nobody
Created new table: 1/6
Created new table: 2/6
Created new table: 3/6
Created new table: 4/6
Created new table: 5/6
signal fault in critical section
signal number: 11, signal code: 1, fault address:
0x7be01f50, pc: 0xff3638ec, sp: 0x79e00978
libthread panic: fault in libthread critical section : dumping core (PID: 11472 LWP 1)
stacktrace:
ff3638d8
ff3639b4
ff367adc
fe7d7e5c
fb0a9e4c
fb0aa49c
fe7a4994
fe549d3c
fe5499e8
fe551aec
fe5522ac
fe7d3c98
fe7cf56c
11f8b0
fb0a3de8
11ca78
fb09e684
fb037874
fb04207c
fb037874
fb070940
fb0606fc
fb07e758
fe7a4994
fe549d3c
fe5493d4
fe549444
fe57a2a8
fe651a10
fe5ed2e8
ff36bb34
fe5ed2c8
(Review ID: 118521)
======================================================================
Name: yyT116575 Date: 07/18/2001
java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0)
Java HotSpot(TM) Client VM (build 1.3.0, mixed mode)
I'm developing a Java program whose purpose in life is to execute a bunch of programs, some of which can be run in parallel and some of which must be run sequentially. Most of the programs to be executed are Perl scripts. The Java program is pretty simple: it creates a thread for each group of tasks that can be run concurrently, and within the thread, it execs the program to be run with /bin/sh -c "program args...". It then reads the stdout and stderr of the exec'ed program while it is alive, and when the program terminates, it goes on to the next task.
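The structure described above (one thread per group of tasks, each task launched through /bin/sh -c with its stdout and stderr read while it runs) can be sketched roughly as follows. This is not the submitter's code, which was not included in the report; the names TaskRunner, Result, drain, runTask and startGroup are hypothetical, and the sketch targets a current Java release rather than 1.3.0.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of the program described above; not the submitter's code.
class TaskRunner {

    /** Exit status and captured output of one finished task. */
    static final class Result {
        final int exitCode;
        final String stdout;
        final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Reads a stream to end-of-file and returns its contents as a string. */
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toString();
    }

    /** Runs one command line via /bin/sh -c and collects its output. */
    static Result runTask(String commandLine) throws IOException, InterruptedException {
        final Process p = Runtime.getRuntime().exec(new String[] {"/bin/sh", "-c", commandLine});
        p.getOutputStream().close(); // nothing is written to the child's stdin

        // Drain stderr on a helper thread so a full pipe cannot stall the child
        // while this thread is busy reading stdout.
        final String[] err = new String[1];
        Thread errReader = new Thread(new Runnable() {
            public void run() {
                try {
                    err[0] = drain(p.getErrorStream());
                } catch (IOException e) {
                    err[0] = "";
                }
            }
        });
        errReader.start();

        String out = drain(p.getInputStream());
        int exitCode = p.waitFor();
        errReader.join();
        return new Result(exitCode, out, err[0]);
    }

    /** Starts a thread that runs the given tasks one after another. */
    static Thread startGroup(final String[] commandLines) {
        Thread group = new Thread(new Runnable() {
            public void run() {
                try {
                    for (String commandLine : commandLines) {
                        Result r = runTask(commandLine);
                        System.out.println(commandLine + " -> exit " + r.exitCode);
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        group.start();
        return group;
    }
}
```

Groups that may run concurrently would each get their own startGroup call, followed by a join() on the returned threads. Note that every runTask call goes through Runtime.exec, which is the operation the crashes in this report are associated with.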
The exec'ed processes crash sometimes. An example of the output produced is included below. A core file is produced, but the core file is not from /bin/sh or perl. Rather, it is a Java core file. My guess is that Runtime.exec is implemented by calling fork() and then calling exec(). When fork() is called, all the threads in the parent process are recreated in the child process, and sometimes one of those threads gets a few time slices before exec() is called. The state of the new process is not such that the other threads can run properly, and once in a while they cause Java to crash.
(Review ID: 128353)
======================================================================