JDK-8214751

X86: Support for VNNI Instructions

Details

    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • Affects Version: 12
    • Fix Version: 12
    • Component: hotspot
    • Resolved In Build: b24
    • CPU: x86

      Description

        This enhancement adds support for the VNNI VPDPWSSD instruction through auto-vectorization.

        The patch vectorizes the following operation when it appears in a loop:
        out[i] += ((in1[2*i] * in2[2*i]) + (in1[2*i+1] * in2[2*i+1]));
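For context, the statement above can be placed in a complete loop as in the sketch below. The element types are an assumption inferred from the generated assembly further down (2-byte scaled loads for the inputs, 4-byte scaled accesses for the output, and the word-to-doubleword `vpmaddwd`); the class and method names are hypothetical and do not come from the patch.

```java
import java.util.Arrays;

public class DotProductLoop {
    // Hypothetical kernel name. The loop body is the statement from the
    // issue description; short inputs and an int output are assumed.
    static void mulAddPairs(short[] in1, short[] in2, int[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] += ((in1[2 * i] * in2[2 * i]) + (in1[2 * i + 1] * in2[2 * i + 1]));
        }
    }

    public static void main(String[] args) {
        short[] in1 = {1, 2, 3, 4};
        short[] in2 = {5, 6, 7, 8};
        int[] out = {10, 20};
        mulAddPairs(in1, in2, out);
        // out[0] = 10 + 1*5 + 2*6, out[1] = 20 + 3*7 + 4*8
        System.out.println(Arrays.toString(out)); // prints [27, 73]
    }
}
```

Each output element consumes two adjacent input pairs, which is why the inputs must hold at least twice as many elements as the output.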

        This patch is useful for AI/ML/DL applications such as convolution-based neural networks.

        More information on VNNI can be found here:
        https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

        Code contributed by: razvan.a.lupusoru@intel.com and vdeshpande (vivek.r.deshpande@intel.com)

        The initial performance gain measured with a microbenchmark on Skylake with AVX3 is 10.8x.
        On that machine the loop compiles to:
        vmovdqu xmm3, xmmword ptr [rbp+r8*2+0x10]
        vmovdqu xmm6, xmmword ptr [rdx+r8*2+0x10]
        vpmaddwd xmm3, xmm6, xmm3
        vpaddd xmm3, xmm3, xmmword ptr [r9+rdi*4+0x10]
        vmovdqu xmmword ptr [r9+rdi*4+0x10], xmm3

        On Cascade Lake, it can generate the vpdpwssd instruction instead.
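The relationship between the two code shapes can be illustrated with a scalar model of one 128-bit register. This is only a sketch of the documented lane semantics, not HotSpot code: `vpmaddwd` multiplies adjacent signed 16-bit pairs and sums each pair into a 32-bit lane, `vpaddd` adds 32-bit lanes, and `vpdpwssd` performs both steps and accumulates into the destination without saturation. The class and method names are hypothetical.

```java
import java.util.Arrays;

public class VnniLaneModel {
    // One xmm register: 8 x 16-bit inputs produce 4 x 32-bit results.
    static final int LANES = 4;

    // Scalar model of vpmaddwd: pairwise multiply, then add adjacent products.
    static int[] pmaddwd(short[] a, short[] b) {
        int[] r = new int[LANES];
        for (int i = 0; i < LANES; i++) {
            r[i] = a[2 * i] * b[2 * i] + a[2 * i + 1] * b[2 * i + 1];
        }
        return r;
    }

    // Scalar model of vpaddd: lane-wise 32-bit add with wraparound.
    static int[] paddd(int[] x, int[] y) {
        int[] r = new int[LANES];
        for (int i = 0; i < LANES; i++) {
            r[i] = x[i] + y[i];
        }
        return r;
    }

    // Scalar model of vpdpwssd: the two steps above fused into one operation.
    static int[] dpwssd(int[] acc, short[] a, short[] b) {
        int[] r = new int[LANES];
        for (int i = 0; i < LANES; i++) {
            r[i] = acc[i] + a[2 * i] * b[2 * i] + a[2 * i + 1] * b[2 * i + 1];
        }
        return r;
    }

    public static void main(String[] args) {
        short[] a = {1, 2, 3, 4, 5, 6, 7, 8};
        short[] b = {1, 1, 1, 1, 1, 1, 1, 1};
        int[] acc = {100, 200, 300, 400};
        int[] twoStep = paddd(pmaddwd(a, b), acc);
        int[] fused = dpwssd(acc, a, b);
        System.out.println(Arrays.toString(fused));       // prints [103, 207, 311, 415]
        System.out.println(Arrays.equals(twoStep, fused)); // prints true
    }
}
```

Java `int` arithmetic wraps on overflow, which matches the non-saturating behavior modeled here.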

        The webrev is here:
        http://cr.openjdk.java.net/~vdeshpande/8214751/VNNI/webrev.00/

              People

                Assignee: vdeshpande Vivek Deshpande (Inactive)
                Reporter: vdeshpande Vivek Deshpande (Inactive)