-
Bug
-
Resolution: Other
-
P4
-
None
-
hs14, 6u85, 7, 8
-
generic
-
generic
Here is a link to a failure after v1.6r22 was used:
http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2008-09-15/Main_Baseline/index.html
Here is a snippet of the log file:
[2008-09-16T16:15:05.53] File red completely
[2008-09-16T16:15:05.53] hs204t001R
[2008-09-16T16:15:05.53] started running thread..
[2008-09-16T16:15:05.53] ... Inside doThisThread..
[2008-09-16T16:15:05.53] State..1108280
[2008-09-16T16:15:05.53] comming out ..
[2008-09-16T16:15:05.53] index =1108290
[2008-09-16T16:15:05.53] # Test level exit status: 96
It looks like sometimes the worker thread prints the "comming out"
line and sometimes it doesn't. Not sure why and not sure if that
matters to the failure mode.
Here is a link to the oldest nightly failure that I found:
http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2008-08-29/Main_Baseline/index.html
Here is a snippet of the logfile:
[2008-08-30T16:11:50.05] MyClass :: Successfully redefined..
[2008-08-30T16:11:50.05] hs204t001R
[2008-08-30T16:11:50.05] started running thread..
[2008-08-30T16:11:50.05] ... In side doThisThread..
[2008-08-30T16:11:50.05] State..149400
[2008-08-30T16:11:50.05] comming out ..
[2008-08-30T16:11:50.05] index =149410
[2008-08-30T16:11:50.05] ... Failed ..
[2008-08-30T16:11:50.05] # Test level exit status: 1
This failure mode is the same as the 2009.03.03 failure, but
the messages are a little bit different. The testbase for
this failure claims to be v1.6r21. The fix for 6354035 was
integrated in v1.6r22.
Here is an interesting entry from my nightly analysis report:
New nsk.jvmti failures (from 2009.03.03)
nsk/jvmti/scenarios/hotswap/HS204/hs204t001
This test failed due to an exit code of 96 on Linux IA32 Client
VM -Xmixed (machine wowamd). The exit code is set as follows:
System.exit(Consts.JCK_STATUS_BASE + run(args, System.out));
The run() method is just a wrapper around the runIt() method:
return new hs204t001().runIt(args, out);
and runIt() returns as follows:
return (t1.getIndex() == afterSuspend ? 0 : 1);
The interesting log output looks like:
State..9130
index =53640
# Test level exit status: 96
The index value shown in the "State.." output line is acquired
after the worker thread is up and running and before it is
suspended. This index value can vary from run to run. Another
index value is saved after the worker thread is suspended and
before popFrame() is called. This index value can also vary
from run to run, but should remain the same as long as the
worker thread is suspended. The test also sets the atomic "run"
variable to false at this time and that should prevent the
index value from being incremented. The index value in the
"index =" output line is acquired after popFrame() and
t1.join() are called. The test's popFrame() function also
resumes the worker thread. The test expects the current return
value from t1.getIndex() to match the index value saved after
the worker thread was suspended.
Generally speaking, when there is an "expect != actual" value
mismatch, the test should output both the expected value and
the actual value. We have the actual value from the "index ="
output line, but we don't have the expected value.
I see a possible problem in this test. The readme file
indicates that the suspendThread()/popFrame() logic pair
assumes that the popFrame() will result in the worker thread's
execution state being reset into the worker thread's run()
method where it will call into doInThisThread() again. While I
don't see any test logic to positively verify this assumption,
this assertion may be faulty because it assumes that
suspendThread() will catch the worker thread in
doInThreadThread() when it could be in hs204t001R.run.get() or
even hs204t001R.suspend.set() if it gets really, really lucky.
If the worker thread is suspended in one of those methods, then
the popFrame() will result in the worker thread's execution
state being reset into the doInThreadThread() method and not
into the worker thread's run() method. Depending on how
hs204t001R.run.get() is implemented, the suspendThread() could
catch the worker thread in a subroutine call down in get() and
the popFrame() will return execution somewhere in get().
If the thread was suspended in a subroutine call in get() after
the current value of the run field was fetched, then after the
thread is resumed, the now stale value of the run field is
returned and we'll get one last increment of the index field
before we loop around and see the real value of run.get().
This would explain a failure where expect != actual and the
difference is one increment value. Since we don't have the
expect value in the log, we don't know the difference between
expect and actual.
Update: run.get() is a simple getter and doesn't call any
other subroutines so that sinks my one theory about why
expect != actual. My search of nightly results has found 30
instances of this test failing. Most of the failures are in
GC baselines, some are in Main and this one is in
RT_Baseline. I don't know if they all fail the same way.
Here is a link to the run's analysis.html file:
http://sqeweb.sfbay/nfs/results/vm/gtee/JDK7/NIGHTLY/VM/2009-03-03/RT_Baseline/vm/linux-i586/client/mixed/linux-i586_client_mixed_nsk.jvmti.testlist/analysis.html
Here is a snippet of the failure log:
[2009-03-04T06:45:44.20] File red completely
[2009-03-04T06:45:44.20] hs204t001R
[2009-03-04T06:45:44.20] started running thread..
[2009-03-04T06:45:44.20] ... Inside doThisThread..
[2009-03-04T06:45:44.20] State..9130
[2009-03-04T06:45:44.20] index =53640
[2009-03-04T06:45:44.20] # Test level exit status: 96
http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2008-09-15/Main_Baseline/index.html
Here is a snippet of the log file:
[2008-09-16T16:15:05.53] File red completely
[2008-09-16T16:15:05.53] hs204t001R
[2008-09-16T16:15:05.53] started running thread..
[2008-09-16T16:15:05.53] ... Inside doThisThread..
[2008-09-16T16:15:05.53] State..1108280
[2008-09-16T16:15:05.53] comming out ..
[2008-09-16T16:15:05.53] index =1108290
[2008-09-16T16:15:05.53] # Test level exit status: 96
It looks like sometimes the worker thread prints the "comming out"
line and sometimes it doesn't. Not sure why and not sure if that
matters to the failure mode.
Here is a link to the oldest nightly failure that I found:
http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2008-08-29/Main_Baseline/index.html
Here is a snippet of the logfile:
[2008-08-30T16:11:50.05] MyClass :: Successfully redefined..
[2008-08-30T16:11:50.05] hs204t001R
[2008-08-30T16:11:50.05] started running thread..
[2008-08-30T16:11:50.05] ... In side doThisThread..
[2008-08-30T16:11:50.05] State..149400
[2008-08-30T16:11:50.05] comming out ..
[2008-08-30T16:11:50.05] index =149410
[2008-08-30T16:11:50.05] ... Failed ..
[2008-08-30T16:11:50.05] # Test level exit status: 1
This failure mode is the same as the 2009.03.03 failure, but
the messages are a little bit different. The testbase for
this failure claims to be v1.6r21. The fix for 6354035 was
integrated in v1.6r22.
Here is an interesting entry from my nightly analysis report:
New nsk.jvmti failures (from 2009.03.03)
nsk/jvmti/scenarios/hotswap/HS204/hs204t001
This test failed due to an exit code of 96 on Linux IA32 Client
VM -Xmixed (machine wowamd). The exit code is set as follows:
System.exit(Consts.JCK_STATUS_BASE + run(args, System.out));
The run() method is just a wrapper around the runIt() method:
return new hs204t001().runIt(args, out);
and runIt() returns as follows:
return (t1.getIndex() == afterSuspend ? 0 : 1);
The interesting log output looks like:
State..9130
index =53640
# Test level exit status: 96
The index value shown in the "State.." output line is acquired
after the worker thread is up and running and before it is
suspended. This index value can vary from run to run. Another
index value is saved after the worker thread is suspended and
before popFrame() is called. This index value can also vary
from run to run, but should remain the same as long as the
worker thread is suspended. The test also sets the atomic "run"
variable to false at this time and that should prevent the
index value from being incremented. The index value in the
"index =" output line is acquired after popFrame() and
t1.join() are called. The test's popFrame() function also
resumes the worker thread. The test expects the current return
value from t1.getIndex() to match the index value saved after
the worker thread was suspended.
Generally speaking, when there is an "expect != actual" value
mismatch, the test should output both the expected value and
the actual value. We have the actual value from the "index ="
output line, but we don't have the expected value.
I see a possible problem in this test. The readme file
indicates that the suspendThread()/popFrame() logic pair
assumes that the popFrame() will result in the worker thread's
execution state being reset into the worker thread's run()
method where it will call into doInThisThread() again. While I
don't see any test logic to positively verify this assumption,
this assertion may be faulty because it assumes that
suspendThread() will catch the worker thread in
doInThreadThread() when it could be in hs204t001R.run.get() or
even hs204t001R.suspend.set() if it gets really, really lucky.
If the worker thread is suspended in one of those methods, then
the popFrame() will result in the worker thread's execution
state being reset into the doInThreadThread() method and not
into the worker thread's run() method. Depending on how
hs204t001R.run.get() is implemented, the suspendThread() could
catch the worker thread in a subroutine call down in get() and
the popFrame() will return execution somewhere in get().
If the thread was suspended in a subroutine call in get() after
the current value of the run field was fetched, then after the
thread is resumed, the now stale value of the run field is
returned and we'll get one last increment of the index field
before we loop around and see the real value of run.get().
This would explain a failure where expect != actual and the
difference is one increment value. Since we don't have the
expect value in the log, we don't know the difference between
expect and actual.
Update: run.get() is a simple getter and doesn't call any
other subroutines so that sinks my one theory about why
expect != actual. My search of nightly results has found 30
instances of this test failing. Most of the failures are in
GC baselines, some are in Main and this one is in
RT_Baseline. I don't know if they all fail the same way.
Here is a link to the run's analysis.html file:
http://sqeweb.sfbay/nfs/results/vm/gtee/JDK7/NIGHTLY/VM/2009-03-03/RT_Baseline/vm/linux-i586/client/mixed/linux-i586_client_mixed_nsk.jvmti.testlist/analysis.html
Here is a snippet of the failure log:
[2009-03-04T06:45:44.20] File red completely
[2009-03-04T06:45:44.20] hs204t001R
[2009-03-04T06:45:44.20] started running thread..
[2009-03-04T06:45:44.20] ... Inside doThisThread..
[2009-03-04T06:45:44.20] State..9130
[2009-03-04T06:45:44.20] index =53640
[2009-03-04T06:45:44.20] # Test level exit status: 96