Type: Enhancement
Resolution: Unresolved
Priority: P4
Affects Version/s: None
Component/s: hotspot
Noticed this while looking at Late Barrier Expansion work.
Shenandoah C2 clone barrier is inserted before calling into arraycopy stub:
void ShenandoahBarrierSetC2::clone_at_expansion(PhaseMacroExpand* phase, ArrayCopyNode* ac) const {
  ...
  // Heap is unstable, call into clone barrier stub
  Node* call = phase->make_leaf_call(unstable_ctrl, mem,
                  ShenandoahBarrierSetC2::clone_barrier_Type(),
                  CAST_FROM_FN_PTR(address, ShenandoahRuntime::clone_barrier),
                  "shenandoah_clone",
                  TypeRawPtr::BOTTOM,
                  src_base);
  call = phase->transform_later(call);
  ...
  // Wire up the actual arraycopy stub now
  ctrl = phase->transform_later(region);
  mem = phase->transform_later(mem_phi);
  const char* name = "arraycopy";
  call = phase->make_leaf_call(ctrl, mem,
                  OptoRuntime::fast_arraycopy_Type(),
                  phase->basictype2arraycopy(T_LONG, nullptr, nullptr, true, name, true),
                  name, TypeRawPtr::BOTTOM,
                  src, dest, length
                  LP64_ONLY(COMMA phase->top()));
  call = phase->transform_later(call);
The arraycopy call that follows does a T_LONG copy. But this dance looks unnecessary, because the arraycopy stub *itself* calls into BarrierSetAssembler::arraycopy_prologue, which Shenandoah handles! See:
address StubGenerator::generate_conjoint_copy_avx3_masked(StubId stub_id, address* entry, address nooverlap_target) {
  ...
  BarrierSetAssembler *bs = BarrierSet::barrier_set()->barrier_set_assembler();
  bs->arraycopy_prologue(_masm, decorators, type, from, to, count);
There are also other ways GCs handle clones, including having a full runtime clone routine that both applies the relevant clone barriers and does the copy.
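To make the suspected redundancy concrete, here is a toy C++ model, not HotSpot code. Every name in it (ToyBarrierSet, toy_arraycopy_stub, toy_clone_with_explicit_barrier, toy_clone_stub_only) is hypothetical, and it assumes, as this report suggests, that the explicit clone barrier and the stub's own prologue do equivalent work. Under that assumption the current shape runs the barrier work twice per clone, while relying on the stub alone runs it once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: none of these types or functions exist in HotSpot.
struct ToyBarrierSet {
    int prologue_calls = 0;  // counts how often the barrier work ran

    // Stands in for the GC-specific work done before a bulk copy.
    void arraycopy_prologue(const std::vector<long>& src) {
        (void)src;  // a real barrier would inspect the source; the toy only counts
        ++prologue_calls;
    }
};

// Models a copy stub that invokes the barrier prologue itself.
void toy_arraycopy_stub(ToyBarrierSet& bs,
                        const std::vector<long>& src,
                        std::vector<long>& dst) {
    assert(src.size() == dst.size());
    bs.arraycopy_prologue(src);
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = src[i];
    }
}

// Models the current shape: explicit barrier call, then the stub.
void toy_clone_with_explicit_barrier(ToyBarrierSet& bs,
                                     const std::vector<long>& src,
                                     std::vector<long>& dst) {
    bs.arraycopy_prologue(src);        // explicit clone barrier (assumed equivalent)
    toy_arraycopy_stub(bs, src, dst);  // the stub runs its own prologue again
}

// Models the alternative: rely on the stub's built-in prologue only.
void toy_clone_stub_only(ToyBarrierSet& bs,
                         const std::vector<long>& src,
                         std::vector<long>& dst) {
    toy_arraycopy_stub(bs, src, dst);
}
```

Calling toy_clone_with_explicit_barrier on a fresh ToyBarrierSet leaves prologue_calls at 2, while toy_clone_stub_only leaves it at 1, with identical copied contents in both cases. Whether the real clone barrier and the real arraycopy prologue are actually interchangeable is exactly what this issue would need to establish.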