Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2139690 | 1.3.1_20 | Mike Belopuhov | P3 | Closed | Won't Fix | |
A licensee reports that a deadlock appears to occur in a program that creates
30 threads.
The issue may depend on the platform, because I cannot reproduce it
on a single-CPU Pentium machine.
CONFIGURATION :
OS:Win2K, WinXP
CPU:Pentium III (dual), Pentium IV (HyperThreading)
JDK:JDK1.4.1_07,JDK1.4.2_03,JDK1.5.0b31
REPRODUCE:
This may occur on multi-CPU machines and on Pentium CPUs with HyperThreading.
1) Compile the attached test programs.
2) Launch "java LocalHostTest".
If you see the DOS prompt again, the program terminated normally.
If not, a deadlock has occurred.
Press Ctrl+Break to get a thread dump.
In the latter case, you will see the following trace information.
It shows that Thread-2048 does not return after calling getLocalHostName().
--- Trace ----
Full thread dump Java HotSpot(TM) Client VM (1.4.2_03-b02 mixed mode):
"DestroyJavaVM" prio=5 tid=0x00034ef8 nid=0x2380 waiting on condition [0..7fad8]
"Thread-2048" prio=5 tid=0x0f036650 nid=0x1c78 runnable [2b44f000..2b44fd8c]
at java.net.Inet4AddressImpl.getLocalHostName(Native Method)
at java.net.InetAddress.getLocalHost(InetAddress.java:1178)
at LocalHostThread.run(LocalHostThread.java:10)
"Signal Dispatcher" daemon prio=10 tid=0x00a28498 nid=0x23b8 waiting on condition [0..0]
"Finalizer" daemon prio=9 tid=0x009f0e28 nid=0x23a0 in Object.wait() [2baf000..2bafd8c]
at java.lang.Object.wait(Native Method)
- waiting on <0x1053ed30> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:111)
- locked <0x1053ed30> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:127)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
"Reference Handler" daemon prio=10 tid=0x009ef9f8 nid=0x2398 in Object.wait() [2b6f000..2b6fd8c]
at java.lang.Object.wait(Native Method)
- waiting on <0x1052edc8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:429)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:115)
- locked <0x1052edc8> (a java.lang.ref.Reference$Lock)
"VM Thread" prio=5 tid=0x00a276f8 nid=0x2390 runnable
"VM Periodic Task Thread" prio=10 tid=0x00a2ac78 nid=0x2358 waiting on condition
"Suspend Checker Thread" prio=10 tid=0x00a38b48 nid=0x23a8 runnable
-------------
CUSTOMER'S INVESTIGATION :
The customer points out that there is a problem in initSockFnTable()
in ./j2se/src/windows/hpi/src/socket_md.c.
=== Extracted source code ===
static void
initSockFnTable() {
    int (PASCAL FAR* WSAStartupPtr)(WORD, LPWSADATA);
    WSADATA wsadata;
    OSVERSIONINFO info;
    mutexInit(&sockFnTableMutex);     <=====(a)
    mutexLock(&sockFnTableMutex);     <=====(b)
    if (sockfnptrs_initialized == FALSE) {
        ...
        // init process               <==== (c)
        ...
    }
    sockfnptrs_initialized = TRUE;
    mutexUnlock(&sockFnTableMutex);   <=====(d)
}
=== Extracted source code end =====
Although initSockFnTable() is called from elsewhere in socket_md.c,
the call itself is not made under an exclusive lock.
When initSockFnTable() is invoked by several threads, several executions
of initSockFnTable() can start simultaneously.
The following scenario explains the trace dump above.
Suppose there are three threads: A, B and C.
(1) Thread A and thread B call sysGetHostName in socket_md.c,
and initSockFnTable() is called in order to initialize sockfnptr[].
(2) Thread A executes line (a) slightly earlier than thread B.
(3) Thread B executes line (a) and re-initializes sockFnTableMutex.
-> The information written by thread A is overwritten by thread B.
(4) Thread A executes line (b), acquires the mutex and starts the
init process at line (c).
(5) Thread B executes line (b), cannot acquire the mutex and enters a wait
state.
(6) The third thread, C, calls sysGetHostName before thread A
finishes line (c).
Because thread C sees that the initialization has not finished, C also
calls initSockFnTable() and executes line (a).
-> sockFnTableMutex is re-initialized twice, at step (3) and step (6).
(7) When thread A finishes initializing at line (c), it tries to
unlock the mutex at line (d). However, because the mutex has been overwritten
by thread C (and B), thread A cannot unlock it at line (d).
As a result, threads B and C can never acquire the mutex, because it is never unlocked.
PROBLEM and REQUEST:
The problem is that sockFnTableMutex is initialized multiple times
by several threads.
Sun should ensure that sockFnTableMutex is initialized only once
per process, even when many threads call initSockFnTable() at the same time.
==========================================================
Backported by: JDK-2139690 sockFnTableMutex should be initialized exclusively in socket_md.c (Closed)