- Bug
- Resolution: Duplicate
- P4
- 15
- aarch64
Several Renaissance benchmarks hang on AArch64 with the VM option "-XX:+UseBarriersForVolatile". This happens on two types of AArch64 platforms from different partners. After analysing the cause, we found it is closely related to the sequential consistency issue described in JDK-8179954.
To resolve the issue caused by mixing "LDR; DMB" for volatile loads with "STLR/STLXR" for volatile stores, the fix for JDK-8179954 inserts a full barrier before the volatile load in C1 and the interpreter. However, that barrier is only emitted when "UseBarriersForVolatile == false".
With "-XX:+UseBarriersForVolatile", C2 instead emits a leading full barrier and a trailing "LOAD_LOAD | LOAD_STORE" barrier around atomics.
E.g. the code generated for a CAS:

      dmb   ish
  retry:
      ldxr  w0, [address]
      cmp   w0, w1
      b.ne  done
      stlxr w8, w2, [address]
      cbnz  w8, retry
  done:
      dmb   ishld
The trailing "LOAD_LOAD | LOAD_STORE" barrier cannot guarantee memory consistency between the "STLXR" and a subsequent "LDR" used for a volatile load; a full barrier is needed there. Accordingly, either inserting a full barrier before the volatile load in C1/Interpreter (i.e. removing the "UseBarriersForVolatile" check) or changing the trailing barrier of the atomic to "MemBarVolatile" resolves the hang on one machine.
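The failure mode described above can be illustrated with a store-buffering-style litmus pattern: each thread performs a CAS (which compiles to an "STLXR" loop followed only by "dmb ishld") and then a volatile load of the other thread's flag. If the trailing barrier orders only later loads/stores after earlier loads, the store half of the "STLXR" is not ordered before the subsequent load, and both threads could read 0, which sequential consistency forbids. A minimal Java sketch of the pattern (the class and field names are illustrative, not from the JDK sources):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class StoreBufferingSketch {
    static final AtomicInteger x = new AtomicInteger(0);
    static final AtomicInteger y = new AtomicInteger(0);
    static volatile int r1, r2;

    // Runs the two-thread store-buffering pattern once; returns true
    // iff the observed outcome is allowed under sequential consistency.
    static boolean run() throws InterruptedException {
        Thread t1 = new Thread(() -> {
            x.compareAndSet(0, 1); // CAS store: STLXR loop on AArch64
            r1 = y.get();          // volatile load of the other flag
        });
        Thread t2 = new Thread(() -> {
            y.compareAndSet(0, 1);
            r2 = x.get();
        });
        t1.start(); t2.start();
        t1.join(); t2.join();
        // Under sequential consistency, r1 == 0 && r2 == 0 is forbidden:
        // at least one thread must observe the other's store.
        return !(r1 == 0 && r2 == 0);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```

On a correct JVM the forbidden outcome never appears; a single run reproducing the hang described here would require the miscompiled barrier sequence on affected AArch64 hardware.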
Apart from the atomics, there are other usages of "STLR/STLXR", which may be why the hang also occurs on another Arm machine. I am not sure whether it is the same issue as JDK-8179954, but that hang can also be resolved by inserting a full barrier before volatile loads in the interpreter (code in "src/hotspot/cpu/aarch64/templateTable_aarch64.cpp"). Unfortunately, I have not yet found the exact code that produces the incorrect reordering.
So do we actually need to keep the VM option "-XX:+UseBarriersForVolatile"? If it is needed and used somewhere, I think it is worth fixing this issue.
- duplicates
  JDK-8243339 AArch64: Obsolete UseBarriersForVolatile option (Resolved)
- relates to
  JDK-8242469 [aarch64] Remove UseBarriersForVolatile support for T88 variant 0 (Closed)
  JDK-8179954 AArch64: C1 and C2 volatile accesses are not sequentially consistent (Closed)