Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2042215 | 1.4.0 | Clifford Click | P2 | Resolved | Fixed | beta2 |
Name: kb87695 Date: 04/05/2001
The Solaris 131 HotSpot VM fails on the javac test
case of jvm98 version 1.03_05 19981203. With default settings, this bug
is sporadic and does not always happen. If we stress the VM
with -Xcomp, -XX:+ScavengeALot and -XX:+FullGCALot, with both
intervals set to 1000, we can reproduce it quite reliably.
Case 1:
This is a Solaris compiler bug that shows up in the SPECjvm tests
when running _213_javac. In normal mode it runs fine. However, if we
stress the VM with excessive compilation, scavenging
and full GCs, the VM fails in various ways. None of these failures
occur in -Xint mode.
1. It could get a signal and core dump.
2. It gets a broken oop and then halts (we run with
-XX:+VerifyOops).
3. The javac program starts emitting incorrect warnings
and errors. In a normal run, no
such warnings or errors are produced.
Below are the command line that reproduces the bug and
the tail of a log file of the output. We have to use the debug build
in order to get the flags that stress the VM appropriately.
The bug appears after about an hour.
From the top directory of jvm98, run:
$ java_g -Xcomp -XX:+PrintCompilation
-XX:+ScavengeALot -XX:ScavengeALotInterval=1000 -XX:+VerifyOops
-XX:+FullGCALot -XX:FullGCALotInterval=1000 -Xmx256m -Xms256m
SpecApplication -s100 -M4 -m4 -g -d3000 -a _213_javac
The tail of the output is like this:
Full gc no: 51 Interval: 783
1447 b spec.benchmarks._213_javac.ArrayExpression::checkInitializer (108 bytes)
1448 b spec.benchmarks._213_javac.BinaryLogicalExpression::checkValue (32 bytes)
1449 b spec.benchmarks._213_javac.ExprExpression::checkCondition (26 bytes)
1450 b spec.benchmarks._213_javac.AndExpression::checkCondition (90 bytes)
EXECUTION STOPPED: verify_oop: L3: broken oop
verify_oop: L3: broken oop
Execution stopped, print registers?
# SafepointSynchronize::begin: Fatal error:
# SafepointSynchronize::begin: Timed out while attempting to reach a safepoint.
# SafepointSynchronize::begin: Threads which did not reach the safepoint:
# nid=0x1 runnable
# SafepointSynchronize::begin: (End of list)
Case 2:
I think the compiler creates an incorrect oopmap, so an oop
held in a register is not properly updated by the garbage
collector. I have also made a slight modification to the
code that makes this bug appear in a small test case.
The test case breaks the modified VM immediately.
I.) The change in the Solaris code.
< instruct checkCastPP( iRegP dst) %{
< match(Set dst (CheckCastPP dst));
< format %{ "#checkcastPP of $dst" %}
< ins_encode( /*empty encoding*/ );
< ins_pipe(empty);
---
> instruct checkCastPP( iRegP dst, iRegP src) %{
> match(Set dst (CheckCastPP src));
> format %{ "#checkcastPP of $dst <- $src" %}
> ins_encode( form3_g0_rs2_rd_move( src, dst ) );
> ins_pipe(ialu_reg);
I guess this change should not affect the
correctness of the generated code,
although it does produce less efficient code with
redundant instructions.
II.) The test case:
/* Run this application in this form
$ java_g -Xcomp -XX:CompileOnly=.Test -XX:-Inline -XX:NewSize=1048576 -XX:+PrintScavenge -XX:+PrintCompilation GcOopTest
*/
public class GcOopTest {
    public static final int ARRAY_SIZE = 1000000;

    // Not compiled (CompileOnly=.Test). The int[] allocation is meant to
    // trigger a scavenge while Test() still holds byteArray.
    public static void doGC(byte[] byteArray, int size) {
        int[] forceScavengeArray = new int[size / 2];
        for (int i = 0; i < size; i++) {
            byteArray[i] = 10;
        }
        //System.gc();
    }

    // The compiled method. Returns false if the array read after doGC()
    // does not see the values that doGC() wrote.
    public static boolean Test(int size) {
        size = size / 2;
        byte[] byteArray = new byte[size];
        for (int i = 0; i < size; i++) {
            byteArray[i] = 1;
        }
        doGC(byteArray, size);
        for (int i = 0; i < size; i++) {
            if (byteArray[i] != 10)
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int size = ARRAY_SIZE;
        char[] a = new char[1];
        byte[] b = new byte[1];
        a[0] = 1;
        b[0] = 1;
        boolean test = false;
        if ((test = Test(size)) == true)
            System.out.println("Ok");
        else
            System.out.println("Error");
    }
}
III.) The output of the test (incorrect). A correct
run prints Ok.
$ java_g -Xcomp -XX:CompileOnly=.Test -XX:-Inline -XX:NewSize=1048576 -XX:+PrintScavenge -XX:+PrintCompilation GcOopTest
VM option 'CompileOnly=.Test'
VM option '-Inline'
VM option 'NewSize=1048576'
VM option '+PrintScavenge'
VM option '+PrintCompilation'
1 b GcOopTest::Test (55 bytes)
[GC 635K->572K(2368K), 0.1161145 secs]
Error
This shows an incorrect run.
IV.) The opto-assembly dump.
I have also attached the OptoAssembly dump of the
method concerned. Please notice my comments around basic
block B20. They are enclosed in << Manosiz ... >>. There
is a copy of register R_L4 to R_L2 done by the CheckCastPP
code.
Even though the oopmap that is generated at the
safepoint (method entry point) contains only R_L2, we use R_L4
after we come back from the method call. If a scavenge occurs
between the call and its return and
objects get moved, only R_L2 will be patched and not R_L4. Hence the bug.
Back in the compiled method, we will be pointing at the old object, while the
object has already been moved to its new location.
In short, this is what happened.
1. The compiled method does not inline anything.
Thus the call to doGC goes to the interpreter.
2. In the compiled method, the oopmap marks only
R_L2 as holding an oop, while R_L4 also contains an oop.
3. After we enter the interpreted method, due to
the reduced size of the new generation space, a scavenge occurs which
moves the related oop.
4. The pointer in R_L2 is updated according to the
oopmap but R_L4 is not.
5. We return to the compiled code, use
R_L4, and find ourselves pointing at the old oop and not the new one.
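The five steps above can be mimicked with a small toy model. This is only an illustrative sketch, not HotSpot code: the class name OopMapSketch, the map-based "heap" and "register file", and the addresses 100 and 200 are all assumptions made up for the example. It shows that a moving collector which patches only the registers listed in the oopmap leaves any unlisted copy of the pointer stale.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class OopMapSketch {
    // Toy heap: address -> payload. Hypothetical, not HotSpot's heap layout.
    static final Map<Integer, String> heap = new HashMap<>();
    // Toy register file: register name -> address it points to.
    static final Map<String, Integer> registers = new HashMap<>();

    // Toy scavenge: move the object from oldAddr to newAddr and patch only
    // the registers named in the oopmap. The old slot keeps stale contents.
    static void scavenge(int oldAddr, int newAddr, Set<String> oopMap) {
        heap.put(newAddr, heap.get(oldAddr));
        heap.put(oldAddr, "stale");
        for (String reg : oopMap) {
            Integer addr = registers.get(reg);
            if (addr != null && addr == oldAddr) {
                registers.put(reg, newAddr);
            }
        }
    }

    // Replays the scenario: R_L4 holds the oop, the CheckCastPP move copies
    // it into R_L2, a scavenge happens during the call, and the code then
    // reads through R_L4.
    static String run(Set<String> oopMap) {
        heap.clear();
        registers.clear();
        heap.put(100, "byteArray");
        registers.put("R_L4", 100);
        registers.put("R_L2", registers.get("R_L4")); // copy R_L4 -> R_L2
        scavenge(100, 200, oopMap);                   // GC during doGC()
        return heap.get(registers.get("R_L4"));       // use R_L4 after return
    }

    public static void main(String[] args) {
        System.out.println(run(Set.of("R_L2")));         // prints "stale"
        System.out.println(run(Set.of("R_L2", "R_L4"))); // prints "byteArray"
    }
}
```

With only R_L2 in the oopmap, the read through R_L4 lands on the old location, which corresponds to the "Error" outcome of the test case. Listing both registers (or not using R_L4 after the call) gives the expected result.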
(Review ID: 120072)
======================================================================
Backported by: JDK-2042215 hs131 fails on jvm98 (Resolved)