Enhancement
Resolution: Unresolved
P4
None
riscv
linux
Let's investigate the new load-acquire and store-release instructions.
Specs:
https://github.com/riscv/riscv-zalasr
As reference:
Table A.6 (Unpriv 2019), same as Table 55 in 2023, suggested a C/C++ mapping.
Table A.7 (Unpriv 2019), same as Table 56 in 2023, suggested a C/C++ mapping that uses hypothetical future "load with acquire/store with release" instructions.
These instructions are in the soon-to-be-ratified Zalasr extension.
The A.6 and A.7 mappings are not compatible, meaning every binary must use the same one, either A.6 or A.7; do not mix them.
As not all RISC-V CPUs have Zalasr, only A.6 can be used for shipped binaries, which in practice means it is not possible to use Zalasr/A.7.
The RISC-V ELF psABI addresses this with a new mapping that is compatible with A.7, so the two can be mixed.
If LR uses aqrl, the psABI mapping is compatible with both A.6 and A.7, but it pays a small cost for the trailing fence on atomic stores.
Atomic operation                   | Table A.6 (Table 55)                    | RISC-V ELF psABI                     | Table A.7 (Table 56) Zalasr                      | Notes
-----------------------------------|-----------------------------------------|--------------------------------------|--------------------------------------------------|--------------------
atomic_load(memory_order_seq_cst)  | fence rw,rw; l{b|h|w|d}; fence r,rw;    | Same as A.6                          | <RCsc atomic load-acquire> (Incompatible A.6)    |
atomic_store(memory_order_seq_cst) | fence rw,w; s{b|h|w|d};                 | fence rw,w; s{b|h|w|d}; fence rw,rw; | <RCsc atomic store-release>                      | psABI mixed A.7
atomic_<op>(memory_order_seq_cst)  | amo<op>.{w|d}.aqrl                      | leading fence amocas for A.6         | leading fence amocas for A.6                     | Incompatibility A.6
atomic_<op>(memory_order_seq_cst)  | L:lr.{w|d}.aqrl;<op>;sc.{w|d}.rl;bnez L | aq or aqrl for A.6 compatible        | lr.{w|d}.aq <op>; sc.{w|d}.rl (Incompatible A.6) |
This means the specs contain no Zalasr mappings that are compatible with the A.6/Table 55 mappings.
Adding a trailing fence rw,rw to store-release, as the psABI does for the fence-based store (fence rw,w (release); s{b|h|w|d}; fence rw,rw;), would be equivalent.
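To make the rows of the table concrete, here is a minimal C++ sketch of a seq_cst store and load. The names `ready`, `payload`, `publish` and `try_consume` are hypothetical and exist only for illustration. The instruction sequences in the comments are copied from the table above; which one a given compiler actually emits depends on its version and target flags.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical shared state, for illustration only.
std::atomic<std::int32_t> ready{0};
std::int32_t payload = 0;

// Writer side: atomic_store(memory_order_seq_cst).
// Per the table, the sequence for the store would be:
//   Table A.6 : fence rw,w; sw
//   psABI     : fence rw,w; sw; fence rw,rw
//   Table A.7 : <RCsc atomic store-release>
void publish(std::int32_t value) {
    payload = value;
    ready.store(1, std::memory_order_seq_cst);
}

// Reader side: atomic_load(memory_order_seq_cst).
// Per the table, the sequence for the load would be:
//   Table A.6 and psABI : fence rw,rw; lw; fence r,rw
//   Table A.7           : <RCsc atomic load-acquire>
bool try_consume(std::int32_t& out) {
    if (ready.load(std::memory_order_seq_cst) == 0) {
        return false;  // nothing published yet
    }
    out = payload;
    return true;
}
```

If `publish` and `try_consume` end up in binaries built with different mappings, the compatibility notes in the table decide whether the pairing is still sequentially consistent.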
GCC 13.3 and above use the psABI mapping.
GCC 13.2 and below use a 'custom' mapping.
Clang 19 and above use the psABI mapping.
Clang 18 and below use the A.6 mapping.
This means a plain A.7 implementation would require a Linux system compiled with GCC 13.3/Clang 19 or newer if there is any interaction with system libraries.
Interaction with the JVM, e.g. from jitted code, requires the VM itself to be compiled with GCC 13.3/Clang 19 or newer.
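The version thresholds above can be written as a small lookup. This is only a sketch: the `Compiler` enum and the functions `seq_cst_mapping` and `mixes_with_plain_a7` are hypothetical helpers, not part of any toolchain or of the JVM, and the thresholds are taken verbatim from the list above.

```cpp
#include <string>

// Hypothetical helper types, for illustration only.
enum class Compiler { Gcc, Clang };

// Returns the name of the atomics mapping used by a compiler version,
// following the thresholds listed in this issue.
std::string seq_cst_mapping(Compiler compiler, int ver_major, int ver_minor) {
    if (compiler == Compiler::Gcc) {
        const bool at_least_13_3 =
            ver_major > 13 || (ver_major == 13 && ver_minor >= 3);
        return at_least_13_3 ? "psABI" : "custom";
    }
    // Clang: 19 and above use psABI, 18 and below use A.6.
    return ver_major >= 19 ? "psABI" : "A.6";
}

// Code built with the psABI mapping can be mixed with a plain A.7
// implementation; code built with older mappings cannot.
bool mixes_with_plain_a7(Compiler compiler, int ver_major, int ver_minor) {
    return seq_cst_mapping(compiler, ver_major, ver_minor) == "psABI";
}
```

For example, `mixes_with_plain_a7(Compiler::Clang, 18, 1)` is false, which is the case where system libraries or the VM would have to be rebuilt.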