Type: Enhancement
Resolution: Fixed
Priority: P4
Affects Version/s: 11, 17, 18, 19
Resolved In Build: b05
CPU: x86_64
In copy_bytes_forward and copy_bytes_backward, which are used in arraycopy stubs, we have code like:
if (UseAVX >= 2) {
  // clean upper bits of YMM registers
  __ vpxor(xmm0, xmm0);
  __ vpxor(xmm1, xmm1);
}
This code was added by JDK-8011102 (originally as vzeroupper), and then changed by JDK-8078113 (vzeroupper replaced with vpxor).
It raised some questions during the early JDK-8279621 review.
I believe these were added to resolve false dependencies of subsequent 128-bit-using instructions on earlier writes to the full 256-bit registers.
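A hedged sketch of that hazard, with register choices and offsets that are mine rather than lifted from the stubs: a legacy-SSE instruction preserves bits 255:128 of its destination register, so after a 256-bit write it carries a false dependency on that write; the vpxor zeroing idiom is recognized at register rename and severs the chain:

__ vmovdqu(xmm0, Address(from, 0));   // VEX.256 load: all 256 bits of ymm0 become live
__ vpxor(xmm0, xmm0);                 // zeroing idiom: ymm0 no longer depends on the wide write
__ movdqu(xmm0, Address(from, 16));   // this legacy-SSE load would otherwise have to wait
                                      // for the 256-bit producer above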
Note: this is still insufficient on Intel x86 implementations to recover from a "dirty" AVX state; only vzeroupper/vzeroall would solve that. But that issue might not even affect our assembler code, which AFAICS uses VEX-encoded versions when AVX > 0 (see Assembler::simd_prefix_and_encode). Every arraycopy stub has vzeroupper at the end, anyhow.
For the x86_64 version, this zeroing seems redundant: there are no XMM-using instructions after we leave copy_bytes_{forward,backward} and go to the stub epilogue, where we meet vzeroupper.
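For reference, a minimal sketch of the epilogue shape being relied on here (register restores and frame teardown elided; not a verbatim copy of the stub generator):

__ vzeroupper();   // resets the entire AVX upper state, which vpxor alone cannot do
// ... restore callee-saved registers, tear down the stub frame ...
__ ret(0);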
For the x86_32 version, this zeroing seems odd. x86_32 qword copying still uses XMM registers, as the 32-bit platform has no other good way to copy 8 bytes at a time. There, using VEX.256 vpxor clears all bits, which is fine right now, but JDK-8279621 changes would probably need to clear the upper 128 bits in AVX=1 mode. Also, it is only enabled for AVX == 2, ignoring AVX-512.
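Two hedged sketches for the 32-bit points above (operands are illustrative, not from the stubs). First, the qword copy that keeps XMM registers live on x86_32, since 32-bit GPRs move at most 4 bytes at a time; second, one possible shape of the AVX=1 clearing, relying on the fact that any VEX.128-encoded instruction zeroes bits 255:128 of its destination:

__ movq(xmm0, Address(from, 0));   // load 8 bytes into the low half of xmm0
__ movq(Address(to, 0), xmm0);     // store them back; only xmm0[63:0] is used

__ vpxor(xmm0, xmm0, xmm0, Assembler::AVX_128bit);  // VEX.128 encoding is legal with plain AVX
                                                    // and zeroes all of ymm0, upper half included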
Draft PR:
https://github.com/openjdk/jdk/pull/7016
blocks:
  JDK-8279621: x86_64 arraycopy stubs should use 256-bit copies with AVX=1 (Closed)
relates to:
  JDK-8178811: Minimize the AVX <-> SSE transition penalty through generation of vzeroupper instruction on x86 (Resolved)
  JDK-8078113: 8011102 changes may cause incorrect results (Resolved)
  JDK-8011102: Clear AVX registers after return from JNI call (Resolved)