JDK-8263223

conversion of jdk.incubator.vector FloatVector to IntVector is very slow


    • x86_64
    • os_x

      ADDITIONAL SYSTEM INFORMATION :
      MacBook Pro / 6-Core Intel Core i9

      MacOS Big Sur 11.2.2

      openjdk version "16" 2021-03-16
      OpenJDK Runtime Environment (build 16+36-2231)
      OpenJDK 64-Bit Server VM (build 16+36-2231, mixed mode, sharing)

      A DESCRIPTION OF THE PROBLEM :
      Example code:

      // BEGIN
        private static final VectorSpecies<Float> VFP = FloatVector.SPECIES_MAX;
        private static final VectorSpecies<Integer> VIP = IntVector.SPECIES_MAX;
        final static int STEP = VFP.length();

        static void updateGeGeneric(int angle, int d, int[] rowOffset, float[] regionY, int[] regionX0, int[] regionX1, int[] regionX, int count) {
          FloatVector mNyNx = FloatVector.broadcast(VFP, SinCos.MINUS_COT[angle]);
          FloatVector dNx = FloatVector.broadcast(VFP, (float)(d * SinCos.INV_SIN[angle] + 0.5f));
          IntVector k4 = IntVector.broadcast(VIP, 4);
          for (int i = 0; i < count; i += STEP) {
            FloatVector y = FloatVector.fromArray(VFP, regionY, i);
            IntVector offset = IntVector.fromArray(VIP, rowOffset, i);
            FloatVector xf = y.fma(mNyNx, dNx);
            // NEXT LINE IS SLOW
            IntVector xi = xf.convert(VectorOperators.F2I, 0).reinterpretAsInts();
            IntVector x0 = IntVector.fromArray(VIP, regionX0, i);
            IntVector x1 = IntVector.fromArray(VIP, regionX1, i);
            IntVector x = xi.max(x0).min(x1);
            IntVector xOff = x.add(offset).mul(k4);
            xOff.intoArray(regionX, i);
          }
        }
      // END

      The profiler shows that the conversion marked above (jdk.incubator.vector.AbstractVector.convert(VectorOperators$Conversion, int)) consumes 99.2% of the method's time.
      Overall, the method is 4.85x slower than the non-vectorized variant (or worse, depending on the vector species used).
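
For comparison, the per-element work of the vectorized loop can be written as a scalar loop. This is only a sketch reconstructed from the example above, not the reporter's actual baseline: the class name `ScalarUpdate`, the method name `updateGeScalar`, and the parameters `minusCot` and `dNx` (standing in for the broadcast values derived from `SinCos`) are assumptions made for illustration.

```java
public final class ScalarUpdate {
    private ScalarUpdate() {}

    // Hypothetical scalar counterpart of updateGeGeneric.
    // minusCot and dNx replace the values the vector code broadcasts
    // from SinCos.MINUS_COT[angle] and d * SinCos.INV_SIN[angle] + 0.5f.
    public static void updateGeScalar(float minusCot, float dNx,
                                      int[] rowOffset, float[] regionY,
                                      int[] regionX0, int[] regionX1,
                                      int[] regionX, int count) {
        for (int i = 0; i < count; i++) {
            // Same fused multiply-add as y.fma(mNyNx, dNx).
            float xf = Math.fma(regionY[i], minusCot, dNx);
            // Plain narrowing cast: the step that F2I performs per lane.
            int xi = (int) xf;
            // Clamp to [regionX0[i], regionX1[i]], as xi.max(x0).min(x1) does.
            int x = Math.min(Math.max(xi, regionX0[i]), regionX1[i]);
            // Add the row offset and scale by 4.
            regionX[i] = (x + rowOffset[i]) * 4;
        }
    }
}
```

In this form the float-to-int step is a single cast per element, which is why a vector conversion that dominates the profile is surprising.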

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Compile and run using "--add-modules=jdk.incubator.vector".

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      The vectorized version is at least as fast as the non-vectorized one.
      ACTUAL -
      The vectorized version is more than 4x slower.

      ---------- BEGIN SOURCE ----------
      git clone git@github.com:eustas/2im.git
      cd 2im
      git checkout update-java
      cd java
      ant
      echo "Baseline"
      java -jar ./build/jar/twim.jar -e -r -t1024 `pwd`/beach.png
      echo "Vectorized"
      java --add-modules=jdk.incubator.vector -jar ./build/jar/twim.jar -e -r -t1024 `pwd`/beach.png
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Speculative: manually convert floats to ints using variable shifts and masking (error-prone for very small or very large values).
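
The workaround is only described in words, so the following is one possible reading of it, shown as a scalar model: pull the exponent and mantissa out of the IEEE 754 bit pattern, then shift the mantissa by an amount that depends on the exponent. The class name `FloatToIntBits` and the method name `truncate` are hypothetical, and a vectorized version would have to apply the same steps per lane with variable shifts and blends. The edge cases in the comments show why the approach is error-prone.

```java
public final class FloatToIntBits {
    private FloatToIntBits() {}

    // Truncates toward zero, matching (int) f for every input except NaN.
    public static int truncate(float f) {
        int bits = Float.floatToRawIntBits(f);
        boolean negative = bits < 0;
        // Unbiased exponent: 8 bits above the 23-bit mantissa field.
        int exponent = ((bits >>> 23) & 0xFF) - 127;
        if (exponent < 0) {
            // |f| < 1, including zeros and subnormals.
            return 0;
        }
        if (exponent > 30) {
            // |f| >= 2^31, infinity, or NaN: saturate.
            // Caveat: (int) NaN is 0, so NaN is handled incorrectly here.
            return negative ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        }
        // 24-bit significand with the implicit leading one restored.
        int mantissa = (bits & 0x007FFFFF) | 0x00800000;
        // Variable shift: left for large exponents, right otherwise.
        int magnitude = exponent >= 23
                ? mantissa << (exponent - 23)
                : mantissa >>> (23 - exponent);
        return negative ? -magnitude : magnitude;
    }
}
```

Even this scalar form needs three separate range cases, which illustrates why a fast built-in F2I conversion is preferable to a hand-rolled one.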

      FREQUENCY : always


            psandoz Paul Sandoz
            webbuggrp Webbug Group