JDK-8190800

Support vectorization of Math.sqrt() on floats

Details

    • b36
    • x86_64
    • linux

    Description

      FULL PRODUCT VERSION :
      java version "9.0.1"
      Java(TM) SE Runtime Environment (build 9.0.1+11)
      Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode)


      FULL OS VERSION :
      Linux pnod0337 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

      EXTRA RELEVANT SYSTEM CONFIGURATION :
      Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz


      A DESCRIPTION OF THE PROBLEM :
      The C2-compiled code correctly selects the single-precision (float) square-root instruction for the float loop, but it does not vectorize that loop (see the source code below).

      When running the code as:
      java -XX:UseAVX=3 -XX:CompileCommand=print,Sqrt.* Sqrt>asm_sqrt.txt

      sqrtDouble is correctly vectorized and uses ZMM registers:
      0x00007fc311924889: vsqrtpd 0x50(%rbx,%r10,8),%zmm0{%k1}{z}
      0x00007fc311924894: vmovdqu64 %zmm0,0x50(%rbx,%r10,8){%k1}

      However, sqrtFloat is not vectorized and uses the scalar version of the instruction:
      0x00007fc311925a20: vsqrtss 0x10(%rdx,%r8,4),%xmm1,%xmm1{%k1}{z}
      0x00007fc311925a28: vmovss %xmm1,0x10(%rdx,%r8,4){%k1}
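
      For context, Math.sqrt is declared only for double, so the body of sqrtFloat implicitly widens each element to double and narrows the result back to float. The sketch below spells out those steps at the Java level and checks that the explicit form matches the one-line form. The class and method names (SqrtFloatSteps, sqrtViaDouble, countMismatches) are hypothetical and not part of the attached source, and the sketch makes no claim about how C2 matches this pattern internally.

```java
public class SqrtFloatSteps {
    // Same computation as (float) Math.sqrt(x), with each implicit step written out.
    static float sqrtViaDouble(float x) {
        double widened = (double) x;       // float -> double, always exact
        double root = Math.sqrt(widened);  // double-precision square root
        return (float) root;               // double -> float, rounds to nearest
    }

    // Counts elements where the explicit form differs bitwise from the one-line form.
    static int countMismatches(float[] samples) {
        int mismatches = 0;
        for (float s : samples) {
            float oneLine = (float) Math.sqrt(s);
            float explicit = sqrtViaDouble(s);
            if (Float.floatToIntBits(oneLine) != Float.floatToIntBits(explicit)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Same input shape as the reproducer: 4000 floats holding 0..3999.
        float[] samples = new float[4000];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = i;
        }
        System.out.println("mismatches: " + countMismatches(samples));
    }
}
```

      Both forms perform the identical sequence of operations, so the count is zero. The point is only that the float loop carries two conversions per element that the double loop does not.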

      THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: Did not try

      THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Did not try

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Compile the attached source code, then run it with:
      java -XX:UseAVX=3 -XX:CompileCommand=print,Sqrt.* Sqrt>asm_sqrt.txt
      Observe the assembly for (C2) compiled methods sqrtFloat and sqrtDouble.
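
      As a rough aid for the inspection step, the sketch below counts the lines of the captured output that mention each square-root mnemonic. It assumes the output was redirected to asm_sqrt.txt as in the command above. The class name AsmSqrtScan and its helper are hypothetical. It does not separate the counts by method, so the listing still has to be read to confirm which loop uses which instruction.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AsmSqrtScan {
    // Packed (vector) and scalar square-root mnemonics for double and float.
    static final String[] MNEMONICS = {"vsqrtpd", "vsqrtps", "vsqrtsd", "vsqrtss"};

    // Returns, for each mnemonic, the number of lines that contain it.
    static Map<String, Integer> countMnemonics(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String m : MNEMONICS) {
            counts.put(m, 0);
        }
        for (String line : lines) {
            for (String m : MNEMONICS) {
                if (line.contains(m)) {
                    counts.merge(m, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Default file name taken from the reproduction command; override via args[0].
        String path = args.length > 0 ? args[0] : "asm_sqrt.txt";
        System.out.println(countMnemonics(Files.readAllLines(Paths.get(path))));
    }
}
```

      A non-zero vsqrtpd count together with a zero vsqrtps count would be consistent with the behavior described in this report.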



      EXPECTED VERSUS ACTUAL BEHAVIOR :
      Expected: the sqrtFloat loop is vectorized and uses the vsqrtps instruction.
      Actual: vsqrtss is used instead and the loop is not vectorized.

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      class Sqrt {
          private void sqrtDouble(final double[] samples) {
              for (int i = 0; i < samples.length; i++) {
                  samples[i] = Math.sqrt(samples[i]);
              }
          }

          private void sqrtFloat(final float[] samples) {
              for (int i = 0; i < samples.length; i++) {
                  samples[i] = (float) Math.sqrt(samples[i]);
              }
          }

          public static void main(String[] argv) throws Exception {
              float[] samples = new float[4000];
              double[] samplesd = new double[4000];
              for (int i = 0; i < samples.length; i++) {
                  samples[i] = i;
                  samplesd[i] = i;
              }
              Sqrt sqrt = new Sqrt();
              // Call both methods repeatedly so that C2 compiles them.
              for (int i = 0; i < 10000; i++) {
                  sqrt.sqrtFloat(samples);
                  sqrt.sqrtDouble(samplesd);
              }
          }
      }

      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Use slow version.

      Attachments

        1. sqrt_useavx0_output.txt
          138 kB
        2. sqrt_vecavx2_output.txt
          173 kB
        3. sqrt_vecavx512_output.txt
          81 kB
        4. Sqrt.java
          0.9 kB

            People

              rlupusoru Razvan Lupusoru
              webbuggrp Webbug Group
              Votes: 0
              Watchers: 6