JDK-8262356

Optimize existing masked operation support for AVX-512.


    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: P4
    • tbd
    • 17
    • hotspot
    • None

      - Currently, a masked vector operation computes its result over all vector lanes and then performs a blend, which selectively updates the result vector under the control of the mask vector.

      - Prior to AVX-512, blending the newly computed result with the previous value was the only way to implement masked (predicated) vector operations.

      - A non-AVX-512 vector blend instruction tests the most significant bit (MSB) of each mask vector lane to choose between the corresponding lanes of the two source vectors.
       
      - With AVX-512, a masked operation can be performed in two ways:
      Method 1:
             vmask = vector_cmp(mask, ALL_ONES)
             vres = vector_operation vsrc1, vsrc2
             vector_blend(vdst, vres, vmask)
       
      Method 2:
             opmask = vector_cmp(mask, ALL_ONES)
             vres = vector_operation vsrc1, vsrc2, opmask

      Emitting a predicated vector operation is clearly better in terms of emitted code size, since no separate blend is needed, and it is more energy efficient, since the operation works only on the selected portion of the vector.
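
To make the difference between Method 1 and Method 2 concrete, here is a minimal scalar sketch in plain Java. It is not HotSpot code and does not use the Vector API; the class name, method names, and the choice of integer addition as the operation are hypothetical. It also assumes that lanes whose mask bit is unset keep the old destination value in Method 2, matching what the blend in Method 1 produces.

```java
import java.util.Arrays;

// Scalar model of the two strategies; all names here are illustrative only.
public class MaskedAddModel {

    // Method 1: run the operation on every lane, then blend with the old value.
    static int[] addThenBlend(int[] dst, int[] a, int[] b, boolean[] mask) {
        int[] full = new int[dst.length];
        for (int i = 0; i < dst.length; i++) {
            full[i] = a[i] + b[i];               // work is done for all lanes
        }
        int[] out = new int[dst.length];
        for (int i = 0; i < dst.length; i++) {
            out[i] = mask[i] ? full[i] : dst[i]; // separate blend step
        }
        return out;
    }

    // Method 2: predicated operation; only lanes with a set mask bit are computed.
    // Assumption: unset lanes keep the old destination value.
    static int[] predicatedAdd(int[] dst, int[] a, int[] b, boolean[] mask) {
        int[] out = dst.clone();
        for (int i = 0; i < dst.length; i++) {
            if (mask[i]) {
                out[i] = a[i] + b[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] dst = {9, 9, 9, 9};
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        boolean[] mask = {true, false, true, false};
        System.out.println(Arrays.toString(addThenBlend(dst, a, b, mask)));  // [11, 9, 33, 9]
        System.out.println(Arrays.toString(predicatedAdd(dst, a, b, mask))); // [11, 9, 33, 9]
    }
}
```

Both methods return the same values; the second simply has no full-width intermediate result and no blend pass, which is the saving the issue describes.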

      - The Vector API has significantly extended the scope of masked operations. It also offers APIs for direct mask manipulation, e.g. VectorMask.or/and/not. Operating directly on an opmask register will therefore enable more efficient code generation.

      - Using an opmask register, we can further optimize the existing implementation of VectorMask query operations such as VectorMask.firstTrue/lastTrue/anyTrue/allTrue/trueCount.
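
The mask operations named above become simple bit operations once a mask is held as one bit per lane, which is how an opmask register represents it. The sketch below models that in plain Java with a long; it does not show the JIT's actual code generation. The class name, the lane count of 8, and the helper names are hypothetical. The return values for an empty mask (lane count for firstTrue, -1 for lastTrue) follow the documented VectorMask behaviour.

```java
// Scalar model of a mask stored one bit per lane; names are illustrative only.
public class OpmaskModel {
    static final int LANES = 8;                 // hypothetical lane count
    static final long ALL = (1L << LANES) - 1;  // all lanes set

    // Mask manipulation: one logical operation each.
    static long or(long m1, long m2)  { return (m1 | m2) & ALL; }
    static long and(long m1, long m2) { return m1 & m2 & ALL; }
    static long not(long m)           { return ~m & ALL; }

    // Mask queries: a bit test or bit count on the mask value.
    static boolean anyTrue(long m) { return (m & ALL) != 0; }
    static boolean allTrue(long m) { return (m & ALL) == ALL; }
    static int trueCount(long m)   { return Long.bitCount(m & ALL); }

    // Index of the lowest set lane, or LANES if no lane is set.
    static int firstTrue(long m) {
        long v = m & ALL;
        return v == 0 ? LANES : Long.numberOfTrailingZeros(v);
    }

    // Index of the highest set lane, or -1 if no lane is set.
    static int lastTrue(long m) {
        return 63 - Long.numberOfLeadingZeros(m & ALL);
    }

    public static void main(String[] args) {
        long m = 0b00101100L;             // lanes 2, 3 and 5 are set
        System.out.println(firstTrue(m)); // 2
        System.out.println(lastTrue(m));  // 5
        System.out.println(trueCount(m)); // 3
        System.out.println(allTrue(or(m, not(m)))); // true
    }
}
```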


            jbhateja Jatin Bhateja