Type: Bug
Resolution: Duplicate
Priority: P2
Fix Version: None
Affects Version: 1.3.0
CPU: sparc
OS: solaris_8
This is a regression from the 1.2.2_06 Solaris_VM (aka Exact VM).
The thread concurrency is not being set correctly. On Solaris, the
threads library multiplexes Java threads onto a smaller pool of LWPs,
and the VM is expected to raise the concurrency level (via
thr_setconcurrency(3T)) as runnable threads are created; if it fails
to do so, only a few threads can run at once, no matter how many CPUs
are available. Bug 4019166, which was filed against an early version
of the Solaris JVM, describes this problem in detail. It appears that
the HotSpot VM for Solaris has reintroduced this bug.
To reproduce this bug, unjar the attached file and run the test program
on a system with at least 8 CPUs as follows:
setenv LD_LIBRARY_PATH .
java MultiThreadTest
This will create and start a master thread and 8 independent compute
threads. Every 5 seconds the main thread prints out the system thread
concurrency. You will notice that on JDK 1.3 and JDK 1.4 the thread
concurrency is set to 3 or 4 (it seems to be somewhat random). This
means that only the main thread, the controlling thread, and one or
two of the compute threads are able to run; most of the compute
threads are starved.
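For reference, here is a minimal sketch of the shape of the test
program (the real source is in the attached jar). The getConcurrency()
native wrapper and the amount of work per pass are assumptions on my
part: the test loads a native library (hence the LD_LIBRARY_PATH
setting), and on Solaris the value it reports would come from
thr_getconcurrency(3T).

    // Minimal sketch of MultiThreadTest (assumed structure; see the
    // attached jar for the real source).  getConcurrency() is a
    // hypothetical JNI wrapper around Solaris thr_getconcurrency(3T).
    public class MultiThreadTest {
        private static native int getConcurrency();
        static { System.loadLibrary("MultiThreadTest"); }

        static class ComputeThread extends Thread {
            int id;
            ComputeThread(int id) { this.id = id; }
            public void run() {
                System.out.println("computeThread[" + id + "].run");
                for (int pass = 1; ; pass++) {
                    double x = 0.0;
                    for (int i = 0; i < 50000000; i++) {
                        x += Math.sqrt(i);    // pure CPU-bound work
                    }
                    System.out.println("compute[" + id + "] finished pass " + pass);
                }
            }
        }

        public static void main(String[] args) throws InterruptedException {
            int n = 8;                        // default of 8 compute threads
            for (int i = 0; i < args.length - 1; i++) {
                if (args[i].equals("-n")) n = Integer.parseInt(args[i + 1]);
            }
            System.out.println("Creating " + n + " computing threads");
            // (The real test also starts a master thread that coordinates
            // the compute threads; it is elided here.)
            for (int i = 0; i < n; i++) {
                new ComputeThread(i).start();
            }
            // The main thread reports the thread concurrency every 5 seconds.
            while (true) {
                System.out.println("Thread concurrency = " + getConcurrency());
                Thread.sleep(5000);
            }
        }
    }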
Here is the output from JDK 1.2.2_06 (Solaris):
caffeine% java -version
java version "1.2.2"
Solaris VM (build Solaris_JDK_1.2.2_06, native threads, sunwjit)
caffeine% java MultiThreadTest
Thread concurrency = 5
Creating 8 computing threads
masterThread.run
computeThread[6].run
computeThread[3].run
computeThread[7].run
computeThread[1].run
computeThread[5].run
computeThread[2].run
computeThread[4].run
computeThread[0].run
compute[1] finished pass 1
compute[5] finished pass 1
compute[0] finished pass 1
compute[4] finished pass 1
compute[2] finished pass 1
compute[6] finished pass 1
compute[3] finished pass 1
compute[7] finished pass 1
Thread concurrency = 14
NOTE that the thread concurrency is up to 14 after all of the threads are
created and running.
Here is the output from JDK 1.3:
caffeine% java -version
java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0)
Java HotSpot(TM) Client VM (build 1.3.0, mixed mode)
caffeine% java MultiThreadTest
Creating 8 computing threads
masterThread.run
computeThread[0].run
computeThread[1].run
computeThread[2].run
computeThread[3].run
computeThread[4].run
computeThread[5].run
computeThread[6].run
computeThread[7].run
Thread concurrency = 3
Thread concurrency = 3
Thread concurrency = 3
...
Thread concurrency = 3
compute[1] finished pass 1
Note that the compute threads are not making much progress. The
problem is even more pronounced with more threads. To run the program
with more threads:
java MultiThreadTest -n 20
This will create 20 compute threads rather than the default of 8.
The server HotSpot VM seems to do quite a bit better after it warms
up, although there are some unexplained drops in CPU utilization, as
shown by "perfbar".
--------------------------------------------------------------------------
kevin.rushforth@Eng 2000-11-21
Here is some more info on this. I tried the workaround suggested by
Tim Cramer (the -XX:+UseLWPSynchronization option). It seems to work
as far as setting the number of concurrent threads goes. I am still
seeing a strange result with the server VM if I run my independent
compute-intensive threads without yielding periodically.
I have attached a new version of the MultiThreadTest.java program
that replaces the one in the jar file. In this version, the master
thread times how long it takes all of the compute threads to complete
10 full passes.
A static final variable, yieldDuringEachPass, controls whether each
compute thread yields frequently during each compute pass (by default,
I have set this variable to false, meaning each thread yields only at
the end of each full pass).
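In terms of the sketch above, the updated ComputeThread.run() would
look something like this (the loop bounds and the per-iteration work
are still placeholders, not the attached source):

    // yieldDuringEachPass is the static final flag described above.
    static final boolean yieldDuringEachPass = false;

    public void run() {
        for (int pass = 1; pass <= 10; pass++) { // master thread times 10 passes
            double x = 0.0;
            for (int i = 0; i < 50000000; i++) {
                x += Math.sqrt(i);               // placeholder CPU work
                if (yieldDuringEachPass && (i % 100000) == 0) {
                    Thread.yield();              // yield frequently within a pass
                }
            }
            Thread.yield();                      // otherwise yield only after a full pass
        }
    }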
If the yieldDuringEachPass variable is false, then the thread
scheduling behavior of both the client and server HotSpot VMs is
suboptimal in the absence of "-XX:+UseLWPSynchronization". You can
see this by looking at a perfbar (I have attached a copy of perfbar,
in case you don't have one), at the completion messages for each
compute thread, and at the performance results. With the
"-XX:+UseLWPSynchronization" flag set, the client VM behaves as
expected, but the server VM is still suboptimal.
If the yieldDuringEachPass variable is set to true, then the thread
scheduling behavior for the suboptimal cases improves (although there
is a small amount of overhead associated with calling Thread.yield()
so often). This suggests that there is a problem with thread
scheduling on the server VM even with the "-XX:+UseLWPSynchronization"
flag set.
To reproduce this, run the program as follows:
java [-server] [-XX:+UseLWPSynchronization] MultiThreadTest [-n 20]
You will need to change the yieldDuringEachPass variable to true for
the second column of performance numbers.
Here is a performance table for a set of runs done on an 8-way system
in our lab. The machine name is caffeine.eng. If you have trouble
finding an 8-way machine, let me know and you can borrow cycles on ours.
System configuration
--------------------
hostname = caffeine
OS = Solaris 8
Num CPUs = 8
CPU speed = 336 MHz
All performance numbers are in seconds (smaller is better).
8 compute threads w/o the LWP flag

                        Yield during each pass
                          false      true
                        -------    -------
1.3 client              437.283    376.817
1.3 server               57.687     45.748
1.2.2_06 (Solaris_VM)    41.301     42.778
Note that, without the LWP flag, the client VM is dramatically slower
than both the server VM and 1.2.2_06.
8 compute threads with the LWP flag

                        Yield during each pass
                          false      true
                        -------    -------
1.3 client w/LWP flag    47.495     48.963
1.3 server w/LWP flag    46.816     43.245
1.2.2_06 (Solaris_VM)    41.301     42.778
20 compute threads (-n 20) with the LWP flag

                        Yield during each pass
                          false      true
                        -------    -------
1.3 client w/LWP flag   118.375    122.612
1.3 server w/LWP flag   129.398    107.824
1.2.2_06 (Solaris_VM)   102.815    106.887