Enhancement
Resolution: Won't Fix
P4
None
hs19
generic
generic
It's currently done by the VM thread, single-threaded. This can become a scalability
bottleneck in large-heap, CMT configurations. A recent measurement with compressed
oops (where the promoted links must be decompressed, adding overhead) showed that
the cost can be fairly substantial when promotion volume is high. Amdahl would
certainly start rearing his head on our larger CMT boxes.
The hope is to reduce the cost from O(n), n = promotion volume,
to O(n/k), k = # workers.
Since this can be done only after termination has been reached,
the overhead of parallelization might exceed its benefit when
promotion volume is low, so we may want to choose dynamically
between the serial and parallel paths based on measured cost.
At any rate, eliminating a potential scalability bottleneck would
definitely be a good thing.