Type: Bug
Resolution: Unresolved
Priority: P4
When a multi-threaded benchmark runs, JMH computes the operation time and operation counts for each thread in isolation. This implicitly relies on the idea that the execution of all thread tasks overlaps. (The synchronize-iterations code also makes sure that, at the iteration edges, the benchmark threads perform parallel "dummy" computations.)
This almost always works for platform threads, because OS schedulers maintain some fairness in allocating resources to running threads. However, it reliably fails for the VIRTUAL executor. There, trying to run N threads on M CPUs (N > M) effectively runs @Benchmark in batches of M. The aggregation code then sums the per-thread results as if all N threads had run concurrently, which overestimates the throughput by a factor of N/M.
This can technically happen with platform threads as well, if the OS scheduler is not fair.
We need to make JMH more reliable in these cases.
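The N/M overestimate described above can be illustrated with a small arithmetic model. This is only a sketch under stated assumptions, not JMH's actual aggregation code: the class and method names are hypothetical, every thread is assumed to run at the same fixed rate while it holds a CPU, N is assumed to be divisible by M, and the batches are assumed to run strictly one after another.

```java
// Hypothetical model of the aggregation problem; not part of JMH.
class ThroughputAggregationSketch {

    // Sums per-thread throughputs, each computed from that thread's own
    // operation count and own elapsed time (threads treated in isolation).
    static double summedPerThreadThroughput(long[] ops, double[] elapsedSeconds) {
        double sum = 0.0;
        for (int i = 0; i < ops.length; i++) {
            sum += ops[i] / elapsedSeconds[i];
        }
        return sum;
    }

    // Throughput actually achieved: all operations divided by wall-clock time.
    static double wallClockThroughput(long[] ops, double wallSeconds) {
        long total = 0;
        for (long o : ops) {
            total += o;
        }
        return total / wallSeconds;
    }

    public static void main(String[] args) {
        int n = 8;                       // benchmark threads (assumed)
        int m = 2;                       // available CPUs (assumed)
        double sliceSeconds = 1.0;       // time each batch of m threads runs (assumed)
        long opsPerSecondPerCpu = 1000;  // fixed per-CPU rate (assumed)

        long[] ops = new long[n];
        double[] elapsed = new double[n];
        for (int i = 0; i < n; i++) {
            // Each thread only observes the slice in which it actually ran.
            ops[i] = (long) (opsPerSecondPerCpu * sliceSeconds);
            elapsed[i] = sliceSeconds;
        }

        // Batches of m run back to back, so wall time is (n / m) slices.
        double wallSeconds = (n / m) * sliceSeconds;

        double reported = summedPerThreadThroughput(ops, elapsed);
        double actual = wallClockThroughput(ops, wallSeconds);
        System.out.println("reported ops/s: " + reported);
        System.out.println("actual ops/s:   " + actual);
        System.out.println("overestimate:   " + (reported / actual));
    }
}
```

In this model, with N = 8 and M = 2, the summed per-thread throughput is four times the wall-clock throughput, which matches the N/M factor. An aggregation based on the common wall-clock interval would not have this problem in the model, though whether that is the right fix for JMH is left open here.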