We have been seeing failures in the CKI project, where the kernels on some virtual machines errored out when attempting to taskset to non-existent CPUs.
The root cause is that the fallback topology is used on those nodes, since some of the /proc or sysfs data made little sense. The fallback topology fakes some (consecutive) CPU numbers, and jcstress then uses taskset to pin forked JVMs to them. That is not guaranteed to work: the real CPU numbers are not guaranteed to be consecutive, so a faked number may not correspond to an existing CPU.
When using the fallback topology, we should force the NONE affinity mode.
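A minimal sketch of the proposed behavior, not the actual jcstress code: the `Topology` class, its `fallback` flag, the `effectiveMode` helper, and every `AffinityMode` constant other than NONE are hypothetical names introduced here for illustration. The idea is simply that whatever affinity mode was requested, a fallback topology overrides it with NONE, so no taskset to faked CPU numbers is attempted.

```java
class AffinitySketch {
    // Hypothetical affinity modes; only NONE is named in this report.
    enum AffinityMode { NONE, GLOBAL, LOCAL }

    // Hypothetical stand-in for a detected topology. The flag records
    // whether the CPU numbers were faked by the fallback path.
    static final class Topology {
        final boolean fallback;

        Topology(boolean fallback) {
            this.fallback = fallback;
        }
    }

    // Faked CPU numbers cannot be trusted for pinning, so a fallback
    // topology always yields NONE regardless of the requested mode.
    static AffinityMode effectiveMode(Topology topology, AffinityMode requested) {
        return topology.fallback ? AffinityMode.NONE : requested;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMode(new Topology(true), AffinityMode.GLOBAL));
        System.out.println(effectiveMode(new Topology(false), AffinityMode.GLOBAL));
    }
}
```

With a fallback topology the requested mode is ignored and forked JVMs run unpinned; with a real topology the requested mode is kept.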
- links to: Review openjdk/jcstress/106