JDK-8262067

SuperWord loop optimization lost after method inlining

      It was reported that after a method containing a loop is inlined, the loop is no longer vectorized (it is not even converted to a counted loop):

      I am encountering a performance issue caused by the interaction between
      method inlining and automatic vectorization.

      Our application aggregates arrays intensively using a method named
      ArrayFloatToArrayFloatVectorBinding.plus() with the following code:

          for (int i = 0; i < srcLen; ++i) {
                  dstArray[i] += srcArray[i];
          }

      When we microbenchmark this method, we observe fast performance, close to the practical memory bandwidth. When we print the assembly code, we see loop unrolling and automatic vectorization with SIMD instructions.
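For illustration, the standalone case described above might be exercised with a harness along the following lines. This is only a sketch, not the reporter's benchmark: the class name `PlusKernelSketch`, the array size, and the iteration count are assumptions, and a plain timing loop is far less rigorous than a proper benchmark framework.

```java
import java.util.Arrays;

public class PlusKernelSketch {

    // Same loop shape as in the report: element-wise dstArray[i] += srcArray[i].
    static void plus(float[] dstArray, float[] srcArray, int srcLen) {
        for (int i = 0; i < srcLen; ++i) {
            dstArray[i] += srcArray[i];
        }
    }

    public static void main(String[] args) {
        final int len = 1 << 16;        // hypothetical array size
        final int iterations = 20_000;  // hypothetical; enough calls for the JIT to compile plus()
        float[] dst = new float[len];
        float[] src = new float[len];
        Arrays.fill(src, 1.0f);

        long start = System.nanoTime();
        for (int it = 0; it < iterations; ++it) {
            plus(dst, src, len);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // Print a result element so the work is observably used.
        System.out.println("dst[0] = " + dst[0] + ", elapsed ms = " + elapsedMs);
    }
}
```

Whether the compiled loop is actually vectorized can only be confirmed by inspecting the generated code, for example with `-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly`, which requires the hsdis disassembler plugin to be installed.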

      In the real application, this method is actually inlined in a higher level
      method named AVector.plus(). Unfortunately, the inlined version of the
      aggregation code is not vectorized anymore.

      This causes a significant performance drop compared to a run where we explicitly disable inlining of this method and observe automatically vectorized code
      again (-XX:CompileCommand=dontinline,com/qfs/vector/binding/impl/ArrayFloatToArrayFloatVectorBinding.plus).
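The call structure described in the report, a loop method invoked through a thin higher-level wrapper, can be sketched as follows. `SketchBinding` and `SketchVector` are hypothetical stand-ins for `ArrayFloatToArrayFloatVectorBinding` and `AVector`; the real classes are not shown in the report, and this sketch is not guaranteed to reproduce the lost vectorization on any particular JDK build.

```java
// Hypothetical stand-in for ArrayFloatToArrayFloatVectorBinding.
class SketchBinding {
    void plus(float[] dstArray, float[] srcArray, int srcLen) {
        for (int i = 0; i < srcLen; ++i) {
            dstArray[i] += srcArray[i];
        }
    }
}

// Hypothetical stand-in for AVector: plus() is a thin wrapper, so the JIT
// is likely to inline SketchBinding.plus() into it.
class SketchVector {
    final float[] data;
    final SketchBinding binding = new SketchBinding();

    SketchVector(int length) {
        data = new float[length];
    }

    void plus(SketchVector other) {
        binding.plus(data, other.data, other.data.length);
    }
}

public class InlinedPlusSketch {
    public static void main(String[] args) {
        SketchVector a = new SketchVector(1 << 16);
        SketchVector b = new SketchVector(1 << 16);
        java.util.Arrays.fill(b.data, 1.0f);

        // Repeated calls so the JIT compiles (and possibly inlines) the hot path.
        for (int it = 0; it < 20_000; ++it) {
            a.plus(b);
        }
        System.out.println("a.data[0] = " + a.data[0]);
    }
}

// To compare the two cases, one might run (flags exist in HotSpot; the
// class pattern matches this sketch, not the reporter's code):
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlinedPlusSketch
//   java -XX:CompileCommand=dontinline,SketchBinding.plus InlinedPlusSketch
```

The first command shows the inlining decisions taken for `SketchVector.plus()`; the second mirrors the reporter's workaround by keeping the loop method as a separate compilation unit.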

            kvn Vladimir Kozlov