JDK-8262356

Optimize existing masked operation support for AVX-512.



    • Type: Enhancement
    • Status: Open
    • Priority: P4
    • Resolution: Unresolved
    • Affects Version/s: 17
    • Fix Version/s: tbd
    • Component/s: hotspot
    • Labels: None


      - Currently, a masked vector operation is performed over all vector lanes and is followed by a blend operation that selectively updates the result vector under the control of the mask vector.

      - Prior to AVX-512, blending the newly computed result with the older value was the only way to implement masked/predicated vector operations.

      - A non-AVX-512 vector blend instruction probes the most significant bit (MSB) of each mask vector lane to choose between the two source vector lanes.
      - With AVX-512, a masked operation can be performed in two ways:
      Method 1:
             vmask = vector_cmp(mask, ALL_ONES)
             vres = vector_operation vsrc1, vsrc2
             vector_blend(vdst, vres, vmask)
      Method 2:
             opmask = vector_cmp(mask, ALL_ONES)
             vres = vector_operation vsrc1, vsrc2, opmask

      Emitting a predicated vector operation (Method 2) is clearly preferable: it reduces the emitted code size and is more energy efficient, since the operation conditionally operates only on the selected portion of the vector.
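
The two methods can be modelled in plain scalar Java to show that they produce the same result while Method 2 needs no separate blend step. This is only a sketch, not the HotSpot implementation: the class `MaskedAddSketch`, its method names, the choice of integer addition as the vector operation, and the use of a `long` (at most 64 lanes) to stand in for an opmask register are all assumptions made for illustration.

```java
import java.util.Arrays;

final class MaskedAddSketch {

    // Method 1: compute every lane, then blend the result into dst under the mask.
    static int[] addThenBlend(int[] dst, int[] a, int[] b, boolean[] mask) {
        int[] res = new int[dst.length];
        for (int i = 0; i < res.length; i++) {
            res[i] = a[i] + b[i];               // operation runs on all lanes
        }
        int[] out = new int[dst.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = mask[i] ? res[i] : dst[i]; // separate blend step
        }
        return out;
    }

    // Pack a boolean mask into one bit per lane (hypothetical opmask model, <= 64 lanes).
    static long toOpmask(boolean[] mask) {
        long bits = 0L;
        for (int i = 0; i < mask.length; i++) {
            if (mask[i]) {
                bits |= 1L << i;
            }
        }
        return bits;
    }

    // Method 2: predicated operation; unselected lanes keep their previous value.
    static int[] predicatedAdd(int[] dst, int[] a, int[] b, long opmask) {
        int[] out = dst.clone();
        for (int i = 0; i < out.length; i++) {
            if (((opmask >>> i) & 1L) != 0) {
                out[i] = a[i] + b[i];           // only selected lanes are computed
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] dst = {100, 200, 300, 400};
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        boolean[] mask = {true, false, true, false};

        System.out.println(Arrays.toString(addThenBlend(dst, a, b, mask)));
        System.out.println(Arrays.toString(predicatedAdd(dst, a, b, toOpmask(mask))));
        // Both lines print [11, 200, 33, 400]
    }
}
```

In this model the first method always performs the addition on every lane and then a second pass over the lanes, which mirrors the extra blend instruction described above.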

      - The Vector API has significantly extended the scope of masked operations; additionally, it offers APIs for direct mask manipulation, e.g. VectorMask.or/and/not. Operating directly on opmask registers will therefore enable more efficient code generation.
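
As a rough illustration of why direct opmask operations are cheap, the sketch below models a mask as one bit per lane in a `long`, so that and/or/not each become a single bitwise operation. The class `OpmaskLogicSketch`, its methods, and the explicit `lanes` parameter are hypothetical; this does not use the Vector API itself and makes no claim about the code HotSpot emits.

```java
final class OpmaskLogicSketch {

    // Bit pattern with the low `lanes` bits set (lanes assumed to be in 1..64).
    static long laneBits(int lanes) {
        return lanes == 64 ? -1L : (1L << lanes) - 1;
    }

    static long and(long m1, long m2) {
        return m1 & m2;
    }

    static long or(long m1, long m2) {
        return m1 | m2;
    }

    // NOT must be confined to the valid lanes, otherwise upper bits become set.
    static long not(long m, int lanes) {
        return ~m & laneBits(lanes);
    }

    public static void main(String[] args) {
        int lanes = 8;
        long m1 = 0b0000_1111L;
        long m2 = 0b0011_1100L;

        System.out.println(Long.toBinaryString(and(m1, m2)));    // 1100
        System.out.println(Long.toBinaryString(or(m1, m2)));     // 111111
        System.out.println(Long.toBinaryString(not(m1, lanes))); // 11110000
    }
}
```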

      - Using opmask registers, we can further optimize the existing implementation of VectorMask query operations such as VectorMask.firstTrue/lastTrue/anyTrue/allTrue/trueCount.
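
Once a mask is held as one bit per lane, the query operations listed above reduce to simple bit-counting and comparison operations. The following sketch assumes the same hypothetical `long`-based mask model (1 to 64 lanes) and follows the documented VectorMask conventions that firstTrue returns the lane count and lastTrue returns -1 when no lane is set; the class and method signatures are illustrative only.

```java
final class MaskQuerySketch {

    // Bit pattern with the low `lanes` bits set (lanes assumed to be in 1..64).
    static long laneBits(int lanes) {
        return lanes == 64 ? -1L : (1L << lanes) - 1;
    }

    // Index of the lowest set lane, or `lanes` if no lane is set.
    static int firstTrue(long m, int lanes) {
        return m == 0 ? lanes : Long.numberOfTrailingZeros(m);
    }

    // Index of the highest set lane, or -1 if no lane is set.
    static int lastTrue(long m) {
        return 63 - Long.numberOfLeadingZeros(m);
    }

    static int trueCount(long m) {
        return Long.bitCount(m);
    }

    static boolean anyTrue(long m) {
        return m != 0;
    }

    static boolean allTrue(long m, int lanes) {
        return m == laneBits(lanes);
    }

    public static void main(String[] args) {
        int lanes = 8;
        long m = 0b0110_1000L; // lanes 3, 5 and 6 are set

        System.out.println(firstTrue(m, lanes)); // 3
        System.out.println(lastTrue(m));         // 6
        System.out.println(trueCount(m));        // 3
        System.out.println(anyTrue(m));          // true
        System.out.println(allTrue(m, lanes));   // false
    }
}
```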


              jbhateja Jatin Bhateja
              jbhateja Jatin Bhateja