Type: Enhancement
Resolution: Fixed
Priority: P3
Fix Version/s: 17
Affects Version/s: 11, 17
Component/s: hotspot
Labels:
- jdk11u-fix-request
- jdk11u-fix-yes

Subcomponent: compiler
Resolved In Build: b10
CPU: aarch64
OS: generic

Issue	Fix Version	Assignee	Priority	Status	Resolution	Resolved In Build
JDK-8263876	11.0.12	Andrew Haley	P3	Resolved	Fixed	b01

Go back a few years, and there were simple atomic load/store exclusive
instructions on Arm. Say you want to do an atomic increment of a
counter. You'd do an atomic load to get the counter into your local cache
in exclusive state, increment that counter locally, then write that
incremented counter back to memory with an atomic store. The cache line
stays in exclusive state the whole time, so you're guaranteed that
no one else changed anything on that cache line while you had it.
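
A minimal sketch of that load / modify / store sequence, written as portable C++ rather than AArch64 assembly. It models the exclusive pair with a compare_exchange_weak retry loop: like a store-exclusive, the weak compare-exchange may fail and must then be retried. The function name increment_with_retry is hypothetical, and which instructions the compiler actually emits depends on the target and flags.

```cpp
#include <atomic>
#include <cstdint>

// Portable model of a load-exclusive / store-exclusive increment.
// Returns the incremented value that this thread stored.
std::uint64_t increment_with_retry(std::atomic<std::uint64_t>& counter) {
    // "Load exclusive": read the current value of the counter.
    std::uint64_t observed = counter.load(std::memory_order_relaxed);
    // "Store exclusive": try to publish observed + 1. On failure,
    // compare_exchange_weak reloads `observed` and the loop retries.
    while (!counter.compare_exchange_weak(observed, observed + 1,
                                          std::memory_order_seq_cst,
                                          std::memory_order_relaxed)) {
    }
    return observed + 1;
}
```

Every retry means the cache line moved to another core between the load and the store, which is the cost the following paragraphs describe.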

This is hard to scale on a very large system (e.g. Fugaku) because if
many processors are incrementing that counter you get a lot of cache
line ping-ponging between cores.

So, Arm decided to add a locked memory increment instruction that
works without needing to load an entire line into local cache. It's a
single instruction that loads, increments, and writes back. The secret
is to send a cache control message to whichever processor owns the
cache line containing the count, tell that processor to increment the
counter and return the incremented value. That way cache coherency
traffic is minimized. This new set of instructions is known as Large
System Extensions, or LSE.
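
For comparison, the same increment expressed as one atomic read-modify-write operation. This is a sketch under an assumption about code generation: when the compiler targets a CPU with LSE (for example GCC with -march=armv8.1-a), a fetch_add like this is typically emitted as a single LDADD-family instruction, while on older targets it becomes an exclusive-pair loop. The function name increment_single_op is hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// One atomic read-modify-write; no retry loop is visible in the source.
// Returns the incremented value.
std::uint64_t increment_single_op(std::atomic<std::uint64_t>& counter) {
    // fetch_add returns the value held before the addition,
    // so add 1 to report the incremented value.
    return counter.fetch_add(1, std::memory_order_seq_cst) + 1;
}
```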

Unfortunately, on recent processors the "old" load/store exclusive
instructions sometimes perform very badly. Therefore, it's now
necessary for software to detect which version of Arm it's running
on, and use the "new" LSE instructions if they're available. Otherwise
performance can be very poor under heavy contention.
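
One way the detection step could look on Linux, as a hedged sketch: the kernel reports CPU features through the auxiliary vector, and on AArch64 the HWCAP_ATOMICS bit indicates that the LSE atomic instructions are available. The helper name cpu_has_lse is hypothetical, and this is not presented as the code in the actual patch. Other operating systems need a different query, so the sketch simply reports false there.

```cpp
#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   // getauxval, AT_HWCAP
#include <asm/hwcap.h>  // HWCAP_ATOMICS
#endif

// Returns true if the running CPU advertises LSE atomics.
bool cpu_has_lse() {
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_ATOMICS) != 0;
#else
    // Not Linux on AArch64: this sketch conservatively reports "no LSE".
    return false;
#endif
}
```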

GCC's -moutline-atomics does this by providing library calls which use
LSE if it's available, but this option is only provided on newer
versions of GCC. This is particularly problematic with older versions
of OpenJDK, which build using old GCC versions.
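
The general idea of routing atomics through a call that is chosen at run time can be sketched as below. This is an illustration only, not GCC's library code and not the HotSpot change: all names (FetchAddFn, select_atomics, dispatched_fetch_add) are hypothetical, and both implementations are portable stand-ins for what would really be hand-written exclusive-pair and LSE code.

```cpp
#include <atomic>
#include <cstdint>

using FetchAddFn = std::uint64_t (*)(std::atomic<std::uint64_t>&, std::uint64_t);

// Stand-in for the "old" path: a load/store-exclusive style retry loop.
static std::uint64_t fetch_add_retry_loop(std::atomic<std::uint64_t>& v,
                                          std::uint64_t delta) {
    std::uint64_t old_value = v.load(std::memory_order_relaxed);
    while (!v.compare_exchange_weak(old_value, old_value + delta)) {
    }
    return old_value;
}

// Stand-in for the "new" path: one atomic read-modify-write operation.
static std::uint64_t fetch_add_single_op(std::atomic<std::uint64_t>& v,
                                         std::uint64_t delta) {
    return v.fetch_add(delta);
}

// Start on the path that works everywhere; switch once at startup.
static FetchAddFn fetch_add_impl = fetch_add_retry_loop;

// Call once during initialization, before other threads use the atomics.
void select_atomics(bool has_lse) {
    fetch_add_impl = has_lse ? fetch_add_single_op : fetch_add_retry_loop;
}

// All callers go through the pointer; returns the previous value.
std::uint64_t dispatched_fetch_add(std::atomic<std::uint64_t>& v,
                                   std::uint64_t delta) {
    return fetch_add_impl(v, delta);
}
```

The indirection costs one indirect call per operation, but it lets a binary built with an old compiler still pick the faster path on hardware that has it.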

Also, I suspect that some other operating systems could use this.
Perhaps not macOS, given that all Apple CPUs support LSE, but
maybe Windows.

backported by

JDK-8263876 AArch64: Support for LSE atomics C++ HotSpot code

Resolved

blocks

JDK-8261649 AArch64: Optimize LSE atomics in C++ code

Resolved

relates to

JDK-8261659 JDK-8261027 causes a Tier1 validate-source failure

Closed

JDK-8263541 Potential race in 8261027: AArch64: Support for LSE atomics C++ HotSpot code

Closed

JDK-8261649 AArch64: Optimize LSE atomics in C++ code

Resolved

JDK-8261660 AArch64: Race condition in stub code generation for LSE Atomics

Closed

links to

Commit openjdk/jdk/40ae9937

Review openjdk/jdk/2434


Assignee: Andrew Haley
Reporter: Andrew Haley
Votes: 0
Watchers: 8
Created: 2021-02-03 02:31
Updated: 2025-01-16 11:48
Resolved: 2021-02-12 05:12
