-
Enhancement
-
Resolution: Won't Fix
-
P4
-
None
-
generic
The specjvm::serial workload tests serialization/deserialization. Its performance depends on the vCPU count and drops at high vCPU counts. The drop is caused by the methods Symbol::try_increment_refcount() and Symbol::decrement_refcount().
Reported numbers:
192vCPU: score=8444, CPU time for try_increment_refcount() ~0.3%, decrement_refcount() ~0.6%
384vCPU: score=5216, CPU time for try_increment_refcount() ~10.5%, decrement_refcount() ~9.8%
Internally, these methods are implemented as an infinite loop around Atomic::cmpxchg:
void Symbol::decrement_refcount() {
  uint32_t found = _hash_and_refcount;
  while (true) {
    uint32_t old_value = found;
    ...
    } else {
      found = Atomic::cmpxchg(&_hash_and_refcount, old_value, old_value - 1);
      if (found == old_value) {
        return; // successfully updated.
      }
      // refcount changed, try again.
    }
  }
}
For highly contended runs, it makes sense to convert this to real locks to reduce CPU utilization.
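To illustrate why the retry loop becomes expensive under contention, here is a minimal standalone sketch of the same compare-and-swap pattern. It uses std::atomic rather than HotSpot's Atomic class, and the names Counter, failed_cas, Result, and run_contended are hypothetical and not part of the JDK sources. It counts how many compare-and-swap attempts fail and have to be retried; each failure is work that is thrown away.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for Symbol: a bare reference count without hash bits.
struct Counter {
  std::atomic<uint32_t> refcount{1};
  std::atomic<uint64_t> failed_cas{0};  // illustration only: wasted attempts

  void increment() { update(1); }
  void decrement() { update(-1); }

 private:
  void update(int delta) {
    uint32_t old_value = refcount.load();
    while (true) {
      // A decrement must never run on a count that is already zero.
      assert(delta > 0 || old_value > 0);
      uint32_t new_value = old_value + static_cast<uint32_t>(delta);
      // On failure, compare_exchange_strong reloads old_value for the retry.
      if (refcount.compare_exchange_strong(old_value, new_value)) {
        return;  // successfully updated
      }
      failed_cas.fetch_add(1, std::memory_order_relaxed);
    }
  }
};

struct Result {
  uint32_t final_refcount;
  uint64_t failed_cas;
};

// Runs paired increment/decrement calls on one shared counter.
Result run_contended(unsigned num_threads, unsigned iterations) {
  Counter counter;
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < num_threads; ++t) {
    workers.emplace_back([&counter, iterations] {
      for (unsigned i = 0; i < iterations; ++i) {
        counter.increment();
        counter.decrement();
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  return {counter.refcount.load(), counter.failed_cas.load()};
}
```

Calling run_contended with a growing thread count on a large machine and comparing failed_cas between runs gives a rough view of how much work the loop discards. The absolute numbers depend on the hardware and scheduler and are not comparable to the scores reported above.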