JDK-8279676

Dubious YMM register clearing in x86_64 arraycopy stubs


    • b05
    • x86_64

      In copy_bytes_forward and copy_bytes_backward, which are used in the arraycopy stubs, we have code like:

            if (UseAVX >= 2) {
              // clean upper bits of YMM registers
              __ vpxor(xmm0, xmm0);
              __ vpxor(xmm1, xmm1);
            }

      This code was added by JDK-8011102 (as vzeroupper) and later changed by JDK-8078113 (to vpxor).
      It raised some questions during the early JDK-8279621 review.

      I believe these were added to avoid false dependencies on the upper halves of the 256-bit YMM registers in subsequent instructions that use only the 128-bit XMM registers.

      Note: on Intel x86 implementations this is still insufficient to recover from "dirty" AVX state; only vzeroupper/vzeroall would do that. That issue might not even affect our assembler code, though, which AFAICS uses VEX-encoded versions when UseAVX > 0 (see Assembler::simd_prefix_and_encode). In any case, every arraycopy stub has vzeroupper at the end.
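
      To make the register semantics above concrete, here is a toy C++ model of the architectural state only. It is a sketch, not HotSpot code: the names Ymm, Half, legacy_sse_write, vex128_write, vex256_vpxor_self, vzeroupper and demo are all hypothetical, and the microarchitectural "dirty upper state" penalty itself is not modelled. It assumes only the documented behavior that legacy SSE encodings leave bits 128 and up of the destination unchanged, that VEX-encoded 128-bit instructions zero them, and that vzeroupper zeroes them in every YMM register.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// One 128-bit half of a register, held as two 64-bit words (hypothetical helper type).
using Half = std::array<std::uint64_t, 2>;

// Toy model of a single 256-bit YMM register.
struct Ymm {
    Half lo{};  // bits 0..127, i.e. the XMM view
    Half hi{};  // bits 128..255
    bool upper_is_zero() const { return hi[0] == 0 && hi[1] == 0; }
};

// Legacy (non-VEX) SSE write: only the low half changes, the upper half is preserved.
inline void legacy_sse_write(Ymm& dst, Half v) { dst.lo = v; }

// VEX.128 write: the low half is written and the upper half is zeroed.
inline void vex128_write(Ymm& dst, Half v) { dst.lo = v; dst.hi = {0, 0}; }

// VEX.256 vpxor of a register with itself: all 256 bits become zero.
inline void vex256_vpxor_self(Ymm& dst) { dst.lo = {0, 0}; dst.hi = {0, 0}; }

// vzeroupper: zeroes the upper half of every register, low halves untouched.
template <std::size_t N>
inline void vzeroupper(std::array<Ymm, N>& regs) {
    for (auto& r : regs) r.hi = {0, 0};
}

// Walks through the scenario from the text; returns true if the model behaves as described.
inline bool demo() {
    std::array<Ymm, 2> regs{};
    regs[0].hi = {~0ull, ~0ull};  // pretend a 256-bit copy loop left data here
    regs[1].hi = {~0ull, ~0ull};

    legacy_sse_write(regs[0], {1, 2});
    const bool legacy_keeps_upper = !regs[0].upper_is_zero();

    vex128_write(regs[1], {3, 4});
    const bool vex128_clears_upper = regs[1].upper_is_zero();

    vzeroupper(regs);  // what every arraycopy stub does at the end
    const bool all_clean = regs[0].upper_is_zero() && regs[1].upper_is_zero();

    return legacy_keeps_upper && vex128_clears_upper && all_clean;
}
```

      In this model, the state after the trailing vzeroupper has clean upper halves whether or not an earlier vpxor ran, which is the sense in which the explicit zeroing looks redundant when nothing reads the XMM registers in between.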

      For the x86_64 version, this zeroing seems redundant: there are no XMM-using instructions between leaving copy_bytes_{forward,backward} and reaching the stub epilogue, where we meet vzeroupper.

      For the x86_32 version, this zeroing seems odd. x86_32 qword copying still uses XMM registers, as the 32-bit platform has no other good way to copy 8 bytes at a time. There, using VEX.256 vpxor clears all bits, which is fine right now, but the JDK-8279621 changes would probably need to clear the upper 128 bits in UseAVX=1 mode. Also, it is only enabled for UseAVX == 2, ignoring AVX-512.

      Draft PR:
       https://github.com/openjdk/jdk/pull/7016

            shade Aleksey Shipilev
            shade Aleksey Shipilev