It would be nice to have a command-line option to set the value returned by java.lang.Runtime.availableProcessors() and its Hotspot internal equivalent. Various things in the system, such as the number of threads in the fork-join common pool, and I believe also the number of GC threads and such, are derived from the number of "processors." For example, a SPARC T5-2 machine might have 32 cores with 8 threads each. The OS returns 256 from sysconf(_SC_NPROCESSORS_CONF) and this value is in turn reflected in Runtime.availableProcessors(). The resulting fork-join common pool size of 255 is based on this number.
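To make the relationship above concrete, here is a minimal sketch that prints the reported processor count next to the common pool's parallelism. The helper `defaultCommonPoolParallelism` is a hypothetical name, not a JDK method. It assumes the documented default of one less than the number of available processors, and the floor of 1 is also an assumption.

```java
import java.util.concurrent.ForkJoinPool;

class ProcessorCountReport {
    // Hypothetical helper: assumes the default common pool parallelism is
    // one less than the reported processor count, and never below 1.
    static int defaultCommonPoolParallelism(int processors) {
        return Math.max(1, processors - 1);
    }

    public static void main(String[] args) {
        // The value the JVM reports, which policies then derive sizes from.
        int reported = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors()   = " + reported);
        // The actual common pool parallelism on this JVM; it can differ from
        // the assumed default if the pool was configured explicitly.
        System.out.println("common pool parallelism = "
                + ForkJoinPool.commonPool().getParallelism());
        System.out.println("assumed default         = "
                + defaultCommonPoolParallelism(reported));
    }
}
```

With 256 reported processors, the assumed default works out to 255, which matches the pool size described above.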
This isn't wrong, but it's potentially misleading. It's unlikely that there will be a parallel speedup of anywhere near 256. For uniform workloads a baseline speedup of around 32x is more likely, since there are 32 "real" processors. For complex or non-uniform workloads, a speedup of greater than 32 is possible, because of interleaved usage of idle functional units on the processors. (This is the advantage of chip multithreading.)
Another way availableProcessors() can be misleading, or be misused, is when performance tests use it to scale their workload. Suppose a test generates a workload that's intended to run for one minute on a single CPU. It might multiply the workload by availableProcessors() in order to run for one wall clock minute on a multi-core machine. But if the workload is multiplied by 256, and there is only 32x parallelism, the benchmark will run for 256 / 32 = 8 minutes of wall clock time. This is clearly the wrong result.
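The arithmetic in this scenario can be sketched as follows. The class and method names are made up for illustration, and the model is the simple one used above: total work grows with the reported processor count, while only the effective parallelism speeds it up.

```java
class WorkloadScaling {
    // Hypothetical model: the test multiplies a single-CPU workload by the
    // reported processor count, but the machine only delivers
    // effectiveParallelism-way speedup.
    static double wallClockMinutes(double singleCpuMinutes,
                                   int reportedProcessors,
                                   int effectiveParallelism) {
        double totalCpuMinutes = singleCpuMinutes * reportedProcessors;
        return totalCpuMinutes / effectiveParallelism;
    }

    public static void main(String[] args) {
        // One-minute workload, 256 reported processors, 32x real parallelism.
        System.out.println(wallClockMinutes(1.0, 256, 32) + " minutes");
    }
}
```

If the reported count matched the effective parallelism (32 and 32), the same model gives the intended one minute.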
The prevailing assumption of policies that use availableProcessors() is that they can use the entire resources of the system. This is a flawed assumption. Consider a case of running test jobs on this 32-core system. It might be configured to run (say) 12 jobs in parallel, in separate JVMs. But if each test scales itself so that it takes 8x wall clock time (as described above), and the 12 JVMs also have to share the same cores, the whole job will end up taking 12 x 8 = 96 times as long as expected in wall clock time.
I don't know what the right answer is. What would be helpful, though, is to allow a diagnostic option of some sort to alter the value returned by availableProcessors() and its equivalent internal interface. When misbehaviors occur on large multicore/multithreaded systems, then this option could be employed to try to learn more about the phenomenon.
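Until something like this exists in the JVM, a test harness can approximate it at the application level. The sketch below is only that: `example.processors` is a hypothetical system property name and `effectiveProcessors()` a hypothetical helper. It affects only code that calls the helper, not the fork-join common pool or the GC.

```java
class ProcessorOverride {
    // Hypothetical property name, chosen for this example only.
    static final String OVERRIDE_PROPERTY = "example.processors";

    // Returns the override if it is set to a positive integer,
    // otherwise whatever the JVM reports.
    static int effectiveProcessors() {
        int reported = Runtime.getRuntime().availableProcessors();
        Integer override = Integer.getInteger(OVERRIDE_PROPERTY);
        return (override != null && override > 0) ? override : reported;
    }

    public static void main(String[] args) {
        System.out.println("effective processors = " + effectiveProcessors());
    }
}
```

It could be run as `java -Dexample.processors=32 ProcessorOverride`. As far as I know, the work done under JDK-8146115 later added a JVM-level flag, `-XX:ActiveProcessorCount=<n>`, which overrides the count the JVM itself uses; check the documentation for your JDK version before relying on it.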
This is more a request for a diagnostic option than for an API that exposes the number of "real" CPUs vs "virtual" CPUs, and such. The latter seems to be covered by JDK-5048379.
For information about multi-core vs multi-threaded architectures, see this Oracle white paper on the SPARC T5:
http://www.oracle.com/technetwork/server-storage/sun-sparc-enterprise/documentation/o13-024-sparc-t5-architecture-1920540.pdf
Duplicates: JDK-8146115 Improve docker container detection and resource configuration usage (Resolved)