- Enhancement
- Resolution: Fixed
- P4
- 24
- b26
- riscv
- linux
After `fill_words`, when the remaining count is less than 8 bytes, we fill the tail with a single 8-byte store. This may
overwrite some already-filled elements and create a misaligned access. While this is not an issue for modern CPUs with
fast misaligned access, it does hurt performance on CPUs where misaligned accesses are emulated by a trap handler and
are therefore very slow. async-profiler attributes 2.8% of CPU time to `jshort_fill` in the flame graph when sampling
SPECjbb2005 on these platforms.
In this particular case, the destination address `to` is 8-byte aligned after `fill_words`. So if `AvoidUnalignedAccesses`
is true, one option is to direct control to `L_fill_elements`, which avoids the alignment issue while filling the
remaining elements.
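
The difference between the two tail strategies can be sketched in plain C++. This is only an illustrative model under stated assumptions, not the HotSpot stub code: the function names `fill_tail_single_store` and `fill_tail_elements` are hypothetical, the element type is assumed to be a 16-bit `jshort`, and `memcpy` stands in for the raw 8-byte store instruction. It assumes, as in the description, that `to` is 8-byte aligned, that fewer than 8 bytes remain, and that at least 8 bytes precede the end of the region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of the tail fill described above; not HotSpot code.
// `to` points just past the part already filled in whole 8-byte words,
// `remaining` is the number of 16-bit elements still to fill (fewer than 4).

// Strategy A: one 8-byte store that ends exactly at the end of the region.
// It rewrites some elements that were already filled. Returns true when the
// store address is not 8-byte aligned.
bool fill_tail_single_store(int16_t* to, std::size_t remaining, int16_t value) {
    // Replicate the 16-bit value into all four lanes of a 64-bit pattern.
    std::uint64_t pattern = static_cast<std::uint16_t>(value);
    pattern |= pattern << 16;
    pattern |= pattern << 32;

    unsigned char* end = reinterpret_cast<unsigned char*>(to + remaining);
    unsigned char* store_addr = end - 8;  // reaches back into filled elements
    std::memcpy(store_addr, &pattern, sizeof pattern);  // models the 8-byte store
    return reinterpret_cast<std::uintptr_t>(store_addr) % 8 != 0;
}

// Strategy B: element-by-element stores. Each 2-byte store is naturally
// aligned, so no misaligned access occurs and nothing is overwritten.
void fill_tail_elements(int16_t* to, std::size_t remaining, int16_t value) {
    for (std::size_t i = 0; i < remaining; ++i) {
        to[i] = value;
    }
}
```

With an 8-byte-aligned buffer and three 16-bit elements remaining, the single store in strategy A starts 2 bytes before `to`, so it is misaligned, while strategy B produces the same contents using only aligned 2-byte stores.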
- links to
- Commit(master) openjdk/jdk/5e0d42b6
- Review(master) openjdk/jdk/22347