- Enhancement
- Resolution: Fixed
- P3
- 11, 17
- b11
- aarch64
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8263877 | 11.0.12 | Andrew Haley | P3 | Resolved | Fixed | b01 |
Now that the C++ HotSpot source supports LSE atomics, we can generate much better code for them. In particular, the sequence we currently generate for CMPXCHG with a full two-way barrier, which uses two DMBs, is far from optimal.
Barrier-ordered-before, Arm Architecture Reference Manual B2.3:
| Barrier instructions order prior Memory effects before subsequent
| Memory effects generated by the same Observer. A read or a write RW1
| is Barrier-ordered-before a read or a write RW2 from the same Observer
| if and only if RW1 appears in program order before RW2 and any of the
| following cases apply:
|
| [...]
|
| * RW1 appears in program order before an atomic instruction with both
| Acquire and Release semantics that appears in program order before RW2.
So a prior load or store cannot be reordered with the load of an atomic swap that has Acquire and Release semantics. In combination with sequential consistency, this barrier-ordered-before relation gives us what we need on the leading side of a full barrier, so no DMB is required before the atomic instruction. However, we still need a DMB after the cmpxchg to ensure that subsequent loads and stores cannot be reordered with the store of the atomic instruction.
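The ordering argument above can be sketched in portable C++. This is only an illustration, not the HotSpot implementation: `cmpxchg_full_barrier` is a hypothetical name, and the mapping to instructions is an assumption about typical compilers. When targeting AArch64 with LSE enabled, a sequentially consistent compare-exchange is commonly compiled to a single acquire-release CAS instruction, and the trailing fence to a DMB.

```cpp
#include <atomic>

// Hypothetical helper (not HotSpot's actual code): a compare-and-exchange
// intended to behave as a full two-way barrier.
template <typename T>
T cmpxchg_full_barrier(std::atomic<T>& dest, T compare_value, T exchange_value) {
  T expected = compare_value;
  // Acquire + Release semantics on the atomic itself. Per the quoted rule,
  // earlier loads and stores are ordered before it, so no leading fence
  // is issued here.
  dest.compare_exchange_strong(expected, exchange_value,
                               std::memory_order_seq_cst,
                               std::memory_order_seq_cst);
  // Trailing fence: keeps later loads and stores from being reordered
  // with the store half of the atomic (assumed to become a DMB on AArch64).
  std::atomic_thread_fence(std::memory_order_seq_cst);
  // On success 'expected' is unchanged; on failure it holds the value
  // actually found. Either way it is the old value of 'dest'.
  return expected;
}
```

The helper returns the previous value of `dest`, so a caller can detect success by comparing the result with `compare_value`. The point of the sketch is the shape of the sequence: one atomic with both Acquire and Release semantics followed by one fence, rather than a fence on each side.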
- backported by
  - JDK-8263877 AArch64: Optimize LSE atomics in C++ code (Resolved)
- is blocked by
  - JDK-8261027 AArch64: Support for LSE atomics C++ HotSpot code (Resolved)
- relates to
  - JDK-8263541 Potential race in 8261027: AArch64: Support for LSE atomics C++ HotSpot code (Closed)
  - JDK-8261027 AArch64: Support for LSE atomics C++ HotSpot code (Resolved)