JDK-7184394

add intrinsics to use AES instructions






        Please review the following webrev which adds intrinsic support to
        allow some of the com/sun/crypto/provider methods to use AES
        instructions when a processor supports such instructions.

           Modern x86 processors have AES instructions to accelerate AES
           encryption and decryption but Hotspot does not have a way to
           generate such instructions. There is a way to hook in a native
           crypto library using PKCS11 and there are a few native libraries
           that support hardware AES instructions. However, these native
           PKCS11 libraries

              * do not scale well with multiple threads
              * are not supported on all platforms; for instance, Hotspot does
                not have PKCS11 support on 64-bit Windows.
              * can be confusing to configure.

        Since this webrev adds intrinsic support for the default
        com/sun/crypto/provider classes, they are supported on all platforms
        and there is no additional configuration required. Measurements have
        shown that they scale very well with multiple threads.

        The rest of this mail describes the scope of the intrinsics and
        summarizes the source file changes.

        -- Tom Deneau

        Scope of the Intrinsics
        When creating a cipher the application specifies a "transformation"
        consisting of "algorithm/mode/padding". For more details see

           * These intrinsics kick in only when the algorithm part is "AES". A
             single block in AES is always 16 bytes and there are intrinsics
             for encrypting or decrypting a single block. These single-block
             intrinsics can work with any mode that uses AES and with any of
             the three AES key sizes (128, 192 or 256 bit).

           * A more optimized multi-block intrinsic can kick in if the
             algorithm/mode is "AES/CBC" (Cipher Block Chaining). Again all
             three AES key sizes are supported. There is no technical reason
             why we couldn't do multi-block intrinsics for the other modes
             (eg, ECB) but I want to get some feedback from the reviewers on
             the implementation before charging off on this path.

           * The padding part is handled by Java routines outside of these
             intrinsics.
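As a rough illustration of the "algorithm/mode/padding" transformation string discussed above, the following sketch runs an AES/CBC round trip through the standard javax.crypto API. The class name, helper name, all-zero key and IV, and the 40-byte input are made up for the example; whether the intrinsics are actually used depends on the JVM build, the CPU and the flags, which this code cannot observe.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class; not part of the webrev.
class AesCbcRoundTrip {
    // Runs one encrypt or decrypt pass with the given transformation,
    // for example "AES/CBC/PKCS5Padding".
    static byte[] transform(int mode, String transformation,
                            byte[] key, byte[] iv, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(transformation);
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];   // 128-bit key; 24 or 32 bytes select AES-192/256
        byte[] iv = new byte[16];    // one AES block; all-zero only for the demo
        byte[] plain = new byte[40]; // two full blocks plus a partial block
        Arrays.fill(plain, (byte) 'x');

        String transformation = "AES/CBC/PKCS5Padding";
        byte[] ct = transform(Cipher.ENCRYPT_MODE, transformation, key, iv, plain);
        byte[] back = transform(Cipher.DECRYPT_MODE, transformation, key, iv, ct);

        // Padding rounds the 40 input bytes up to a whole number of 16-byte blocks.
        System.out.println(ct.length + " bytes of ciphertext, round trip ok: "
                + Arrays.equals(plain, back));
    }
}
```

The application code is the same whether the Java path or an intrinsic runs underneath; only the algorithm and mode parts of the string determine which intrinsic, if any, can apply.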

        Summary of Changes

        src/cpu/x86/vm/assembler_x86.cpp, hpp
           Defined the AES instructions which are used by the stub routines.

           Actual stub code for the AES intrinsics. As described earlier there
           are both single-block and multi-block intrinsic stubs.

           Note that the stubs make use of the "expanded key" which gets
           created each time the key changes. The expanded key is used by both
           the Java code and the intrinsic AES instructions.

           The Java code stores the "expanded key" in big-endian 32-bit
           integers. The x86 AES instructions require the expanded key to be
           in little-endian 128-bit words. Hence the pshufb instructions to
           get the key into the little-endian format.
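The byte reordering described above can be modelled in plain Java. This is only a sketch: it assumes the shuffle reverses the bytes within each 32-bit word of a 128-bit load, which is an assumption about the mask rather than something stated in this mail, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical model of the key byte shuffle; not the stub code itself.
class KeyShuffleSketch {
    // Lays out the words as a little-endian CPU stores a Java int[] in memory.
    static byte[] littleEndianMemory(int[] words) {
        ByteBuffer buf = ByteBuffer.allocate(words.length * 4)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (int w : words) {
            buf.putInt(w);
        }
        return buf.array();
    }

    // Reverses the bytes inside every 4-byte group (input length must be a
    // multiple of 4). This models a byte shuffle with a per-word reversal
    // mask, which is the assumption made in this sketch.
    static byte[] reverseBytesPerWord(byte[] in) {
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = in[(i & ~3) | (3 - (i & 3))];
        }
        return out;
    }

    public static void main(String[] args) {
        // One round key as the Java code would hold it: key bytes 00..0f
        // packed big-endian into four ints.
        int[] roundKey = {0x00010203, 0x04050607, 0x08090a0b, 0x0c0d0e0f};
        byte[] memory = littleEndianMemory(roundKey);
        byte[] shuffled = reverseBytesPerWord(memory);

        StringBuilder sb = new StringBuilder();
        for (byte b : shuffled) {
            sb.append(String.format("%02x ", b));
        }
        System.out.println(sb.toString().trim());
    }
}
```

In this model the in-memory image starts with 03 02 01 00, and the shuffle restores the key bytes to ascending order within the 16-byte block.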

        src/cpu/x86/vm/vm_version_x86.cpp, hpp
           Detect and store the AES capability bit in cpuid. A global boolean
           command line flag UseAES can be used to turn off AES even if the
           hardware supports it.

        src/share/vm/opto/runtime.cpp, hpp
           The usual definitions of class names, method names and signatures
           for the Java methods that are being intrinsified and the signatures
           for the stubs.

           Up until now, every intrinsic was replacing a routine that was
           loaded by the "default" (NULL) class loader.
           com/sun/crypto/provider is not loaded by the default class
           loader so we had to add a check here.

           Escape analysis knows about certain stubs, but if it sees a leaf
           stub it also checks against a predefined list. So the new intrinsic
           names were added to the list.


           The main logic for building up the calls to the stubs at compile
           time, assuming the platform has a stub and the global flags have
           not turned these intrinsics off.

           A new helper routine to load a field from an object was added since
           we ended up loading fields in a few places.

           For best performance, we wanted to hook into the multi-block
           encrypt and decrypt methods such as in CipherBlockChaining.java.
           This code is not AES-specific but handles CBC mode for any
           algorithm. (The algorithm part is handled by the enclosed
           "embeddedCipher" object).

           Thus at runtime we want to do the equivalent of an instanceof check
           on embeddedCipher and either call the stub (if it is AESCrypt) or
           call the original java code (if it is some other algorithm
           type). For CipherBlockChaining.decrypt there is a further
           runtime check that the source and destination are not the same
           array, because with the way CBC works that case would require
           cloning the source (the ciphertext).

           Vladimir added some infrastructure to generate predicated
           intrinsics to solve the above problem. A particular intrinsic need
           only specify that it is predicated, and generate the particular
           guard node which if false will take the Java path. This
           infrastructure can be used for future intrinsics that have to make
           such a runtime choice. These changes from Vladimir are in
           callGenerator.cpp, doCall.cpp, and a small bit in library_call.cpp.

           Global flags were added to:
              * turn off either AES encryption or AES decryption intrinsics separately
              * turn off the multi-block CBC/AES intrinsics.

           By default all of the above are on. These are really there for
           testing; for example, one could encrypt using Java and decrypt
           using the intrinsics.

           Also, a UseAES flag to ignore the hardware capability as described above.


                kvn Vladimir Kozlov