I made about 73 SpecApplication runs, over about 50 CPU hours.
The runs were 3 at a time on Krajy, a 500 MHz 4-way machine.
Two runs ended with the "not a call imm32" assert.
(NOTE: There were also 8 assertion failures with the
message "GC active during No_GC_Verifier"; this is an
unrelated bug!)
Here is a partial transcript analyzing one of the two failures.
--------
date
Tue May 7 04:54:10 PDT 2002
--------
./gamma_gfast -XX:+ShowMessageBoxOnError -Xmx32m SpecApplication -a -s100 -m5 -M5 _227_mtrt _202_jess _201_compress _209_db _222_mpegaudio _228_jack _213_javac
# Executing gamma in fastdebug directory...
VM option '+ShowMessageBoxOnError'
VM option '+ShowMessageBoxOnError'
Speed will be 100
Will run each benchmark at least 5 times
Will run each benchmark at most 5 times
Caching Off Speed = 100 Auto run mode
======= _227_mtrt Starting =======
Run 0 start. Total memory=2031616 free memory=1841448
+100 to 199 by 200
+0 to 99 by 200
Addr: 0xda0e2482 Code: 0xeb >>>> JMP not CALL
#
# HotSpot Virtual Machine Error, Internal Error
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Server VM (1.4.1-internal-debug mixed mode)
#
# Fatal: not a call imm32
#
# Error ID: /net/lukasiewicz/export/lukasiewicz1/jrose/hopper/src/cpu/i486/vm/nativeInst_i486.cpp, 21
#
# Problematic Thread: prio=5 tid=0x81c1620 nid=0xe runnable
#
Internal Error
Fatal: not a call imm32
Do you want to debug the problem?
# SafepointSynchronize::begin: Fatal error:
# SafepointSynchronize::begin: Timed out while attempting to reach a safepoint.
# SafepointSynchronize::begin: Threads which did not reach the safepoint:
# Thread: 0x81c1620 [0x e] State: _call_back pc: 0x0
nid=0xe runnable
# SafepointSynchronize::begin: (End of list)
...
--------
dbx - 7169
... >>>> See FAIL4.threads for a full set of backtraces.
(dbx) w 4
current thread: t@14
=>[6] report_fatal(file_name = ???, line_no = ???, format = ???, ...) (optimized), at 0xdebcea99 (line ~139) in "jrose/hopper/src/share/vm/utilities/debug.cpp"
[7] NativeCall::verify(this = ???) (optimized), at 0xdf054270 (line ~21) in "jrose/hopper/src/cpu/i486/vm/nativeInst_i486.cpp"
[8] SharedRuntime::reresolve_call_site(caller = CLASS) (optimized), at 0xdf170715 (line ~622) in "jrose/hopper/src/share/vm/runtime/sharedRuntime.cpp"
[9] OptoRuntime::handle_wrong_method(thread = ???) (optimized), at 0xdf14af91 (line ~489) in "jrose/hopper/src/share/vm/opto/runtime.cpp"
>>>> handle_wrong_method is called for both missed monomorphic V-calls
and also for zombie methods. Since the call site in question is
calling the VEP of 0xda0d558c, the trip through handle_wrong_method
was probably handling a zombie method, the previous version of
0xda0d558c.
<<<<
(dbx) p -fx *(unsigned char*)0xda0e2482
*((unsigned char *) 0xda0e2482U) = 0xe8
(dbx) x 0xda0e2482/i
0xda0e2482: call 0xda0d558c <0xda0d558c>
>>>> Instruction is a call, again => A RACE CONDITION.
>>>> Note that the instruction displacement lies across a word boundary.
(dbx) x 0xda0d558c-12/60i
>>>> Examine callee Java method.
>>>UEP:
0xda0d5580: cmpl 4(%ecx),%eax
0xda0d5583: jne 0xda0a97d0 <0xda0a97d0> >>>WRONG METHOD
0xda0d5589: nop
0xda0d558a: nop
0xda0d558b: nop
>>>VEP:
0xda0d558c: movl %eax,0xffffc000(%esp,1) >>>BANG STACK
0xda0d5593: subl $0xc,%esp >>> MAKE FRAME
0xda0d5599: flds 8(%edx)
0xda0d559c: fstp %st(2)
0xda0d559e: flds 16(%edx)
0xda0d55a1: fstp %st(1)
0xda0d55a3: flds 8(%ecx)
0xda0d55a6: fstp %st(6)
0xda0d55a8: flds 16(%esp,1)
0xda0d55ac: fmul %st(1),%st
0xda0d55ae: fstps 0(%esp,1)
... (rest of code is in FAIL4.callee)
>>>> Examine caller Java method.
(dbx) x 0xda0e1e00/772i
... (rest of code is in FAIL4.caller)
0xda0e246c: flds 24(%ebx)
0xda0e246f: fmul %st(2),%st
0xda0e2471: fstps 104(%esp,1)
0xda0e2475: movl %ebp,%ecx
0xda0e2477: flds 104(%esp,1)
0xda0e247b: fstp %st(2)
0xda0e247d: fld %st(1)
0xda0e247f: fstps (%esp,1)
0xda0e2482: call 0xda0d558c <0xda0d558c> >>>> CALL VEP OF CALLEE
0xda0e2487: movl 8(%edi),%ecx
0xda0e248a: movl 8(%ecx),%ebp
...
<<<<
The only places where 0xEB is patched in over a call are:
NativeCall::replace_mt_safe
(apparently unused)
NativeCall::set_destination_mt_safe
used only in compiledIC.cpp, several times
1. CompiledIC::set_ic_destination (virtual calls only)
2. All other occurrences are in CompiledStaticCall methods.
Since the callee method is virtual, set_ic_destination is
the likely culprit. These are the calls to set_ic_destination:
1. Two in icBuffer.cpp, for transition stubs.
2. CompiledIC::set_to_monomorphic is used on first call.
3. CompiledIC::set_to_clean is used to reinitialize.
Since the call site is non-virtual, only 2 or 3 are likely.
The most likely scenario is a method becoming a zombie,
which is then called simultaneously by two processors
from the same call site, leading to a race between
two threads running reresolve_call_site.
The runs were 3 at a time on Krajy, a 500 MHz 4-way machine.
Two runs ended with the "not a call imm32" assert.
(NOTE: There were also 8 assertion failures with the
message "GC active during No_GC_Verifier"; this is an
unrelated bug!)
Here is a partial transcript analyzing one of the two failures.
--------
date
Tue May 7 04:54:10 PDT 2002
--------
./gamma_gfast -XX:+ShowMessageBoxOnError -Xmx32m SpecApplication -a -s100 -m5 -M5 _227_mtrt _202_jess _201_compress _209_db _222_mpegaudio _228_jack _213_javac
# Executing gamma in fastdebug directory...
VM option '+ShowMessageBoxOnError'
VM option '+ShowMessageBoxOnError'
Speed will be 100
Will run each benchmark at least 5 times
Will run each benchmark at most 5 times
Caching Off Speed = 100 Auto run mode
======= _227_mtrt Starting =======
Run 0 start. Total memory=2031616 free memory=1841448
+100 to 199 by 200
+0 to 99 by 200
Addr: 0xda0e2482 Code: 0xeb >>>> JMP not CALL
#
# HotSpot Virtual Machine Error, Internal Error
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Server VM (1.4.1-internal-debug mixed mode)
#
# Fatal: not a call imm32
#
# Error ID: /net/lukasiewicz/export/lukasiewicz1/jrose/hopper/src/cpu/i486/vm/nativeInst_i486.cpp, 21
#
# Problematic Thread: prio=5 tid=0x81c1620 nid=0xe runnable
#
Internal Error
Fatal: not a call imm32
Do you want to debug the problem?
# SafepointSynchronize::begin: Fatal error:
# SafepointSynchronize::begin: Timed out while attempting to reach a safepoint.
# SafepointSynchronize::begin: Threads which did not reach the safepoint:
# Thread: 0x81c1620 [0x e] State: _call_back pc: 0x0
nid=0xe runnable
# SafepointSynchronize::begin: (End of list)
...
--------
dbx - 7169
... >>>> See FAIL4.threads for a full set of backtraces.
(dbx) w 4
current thread: t@14
=>[6] report_fatal(file_name = ???, line_no = ???, format = ???, ...) (optimized), at 0xdebcea99 (line ~139) in "jrose/hopper/src/share/vm/utilities/debug.cpp"
[7] NativeCall::verify(this = ???) (optimized), at 0xdf054270 (line ~21) in "jrose/hopper/src/cpu/i486/vm/nativeInst_i486.cpp"
[8] SharedRuntime::reresolve_call_site(caller = CLASS) (optimized), at 0xdf170715 (line ~622) in "jrose/hopper/src/share/vm/runtime/sharedRuntime.cpp"
[9] OptoRuntime::handle_wrong_method(thread = ???) (optimized), at 0xdf14af91 (line ~489) in "jrose/hopper/src/share/vm/opto/runtime.cpp"
>>>> handle_wrong_method is called for both missed monomorphic V-calls
and also for zombie methods. Since the call site in question is
calling the VEP of 0xda0d558c, the trip through handle_wrong_method
was probably handling a zombie method, the previous version of
0xda0d558c.
<<<<
(dbx) p -fx *(unsigned char*)0xda0e2482
*((unsigned char *) 0xda0e2482U) = 0xe8
(dbx) x 0xda0e2482/i
0xda0e2482: call 0xda0d558c <0xda0d558c>
>>>> Instruction is a call, again => A RACE CONDITION.
>>>> Note that the instruction displacement lies across a word boundary.
(dbx) x 0xda0d558c-12/60i
>>>> Examine callee Java method.
>>>UEP:
0xda0d5580: cmpl 4(%ecx),%eax
0xda0d5583: jne 0xda0a97d0 <0xda0a97d0> >>>WRONG METHOD
0xda0d5589: nop
0xda0d558a: nop
0xda0d558b: nop
>>>VEP:
0xda0d558c: movl %eax,0xffffc000(%esp,1) >>>BANG STACK
0xda0d5593: subl $0xc,%esp >>> MAKE FRAME
0xda0d5599: flds 8(%edx)
0xda0d559c: fstp %st(2)
0xda0d559e: flds 16(%edx)
0xda0d55a1: fstp %st(1)
0xda0d55a3: flds 8(%ecx)
0xda0d55a6: fstp %st(6)
0xda0d55a8: flds 16(%esp,1)
0xda0d55ac: fmul %st(1),%st
0xda0d55ae: fstps 0(%esp,1)
... (rest of code is in FAIL4.callee)
>>>> Examine caller Java method.
(dbx) x 0xda0e1e00/772i
... (rest of code is in FAIL4.caller)
0xda0e246c: flds 24(%ebx)
0xda0e246f: fmul %st(2),%st
0xda0e2471: fstps 104(%esp,1)
0xda0e2475: movl %ebp,%ecx
0xda0e2477: flds 104(%esp,1)
0xda0e247b: fstp %st(2)
0xda0e247d: fld %st(1)
0xda0e247f: fstps (%esp,1)
0xda0e2482: call 0xda0d558c <0xda0d558c> >>>> CALL VEP OF CALLEE
0xda0e2487: movl 8(%edi),%ecx
0xda0e248a: movl 8(%ecx),%ebp
...
<<<<
The only places where 0xEB is patched in over a call are:
NativeCall::replace_mt_safe
(apparently unused)
NativeCall::set_destination_mt_safe
used only in compiledIC.cpp, several times
1. CompiledIC::set_ic_destination (virtual calls only)
2. All other occurrences are in CompiledStaticCall methods.
Since the callee method is virtual, set_ic_destination is
the likely culprit. These are the calls to set_ic_destination:
1. Two in icBuffer.cpp, for transition stubs.
2. CompiledIC::set_to_monomorphic is used on first call.
3. CompiledIC::set_to_clean is used to reinitialize.
Since the call site is non-virtual, only 2 or 3 are likely.
The most likely scenario is a method becoming a zombie,
which is then called simultaneously by two processors
from the same call site, leading to a race between
two threads running reresolve_call_site.
- relates to
-
JDK-4681431 occasional false assert thrown by JRT_LEAF routines
- Closed