Summary
Provide an initial iteration of an [incubator module], jdk.incubator.vector, to express vector computations that reliably compile at runtime to optimal vector hardware instructions on supported CPU architectures and thus achieve superior performance to equivalent scalar computations.
Problem
Vector computations consist of a sequence of operations on vectors. A vector comprises a (usually) fixed sequence of scalar values, where the number of scalar values corresponds to the number of hardware-defined vector lanes. A binary operation applied to two vectors with the same number of lanes applies, for each lane, the equivalent scalar operation to the corresponding two scalar values, one from each vector. This is commonly referred to as Single Instruction Multiple Data (SIMD).
Vector operations express a degree of parallelism that enables more work to be performed in a single CPU cycle and thus can result in significant performance gains. For example, given two vectors each covering a sequence of eight integers (eight lanes), then the two vectors can be added together using a single hardware instruction. The vector addition hardware instruction operates on sixteen integers, performing eight integer additions, in the time it would ordinarily take to operate on two integers, performing one integer addition.
HotSpot supports auto-vectorization, where scalar operations are transformed into superword operations, which are then mapped to vector hardware instructions. The set of transformable scalar operations is limited and fragile to changes in the code shape. Furthermore, only a subset of the available vector hardware instructions might be utilized, limiting the performance of the generated code.
A developer wishing to write scalar operations that are reliably transformed into superword operations needs to understand HotSpot's auto-vectorization support and its limitations to achieve reliable and sustainable performance.
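As an illustration of the kind of code shape involved, the sketch below shows a simple scalar loop whose iterations are independent of one another. Loops of this form are the usual candidates for auto-vectorization, but whether HotSpot actually transforms a given loop depends on the JVM version, flags, and CPU. The class and method names are hypothetical.

```java
// Hypothetical example class; not part of the JDK or the Vector API.
final class ScalarAdd {
    // Element-wise addition: each iteration reads a[i] and b[i] and writes c[i],
    // with no dependency on any other iteration.
    static void add(int[] a, int[] b, int[] c) {
        for (int i = 0; i < c.length; i++) {
            c[i] = a[i] + b[i];
        }
    }
}
```

Small changes to such a loop (for example, introducing a dependency between iterations) can change whether it is transformed, which is the fragility described above.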
In some cases it may not be possible for the developer to write scalar operations that are transformable. For example, HotSpot does not transform the simple scalar operations for calculating the hash code of an array (see the Arrays.hashCode method implementations in the JDK source code), nor can it auto-vectorize code to lexicographically compare two arrays (which is why an intrinsic was added to perform lexicographical comparison, see JDK-8033148).
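For reference, the scalar hash code computation for an int array can be sketched as follows. The class name is hypothetical; the loop mirrors the documented behavior of Arrays.hashCode(int[]) for non-null arrays. Each iteration consumes the result of the previous one, which is the kind of loop-carried dependency that auto-vectorizers typically do not handle.

```java
// Hypothetical example class mirroring Arrays.hashCode(int[]) for non-null input.
final class ScalarHash {
    static int hash(int[] a) {
        int result = 1;
        for (int element : a) {
            // The new value depends on the value from the previous iteration.
            result = 31 * result + element;
        }
        return result;
    }
}
```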
Solution
The Vector API aims to address these issues by providing a mechanism to write
complex vector algorithms in Java, using pre-existing support in HotSpot
for vectorization, but with a user model which makes vectorization far more
predictable and robust. Hand-coded vector loops can express high-performance
algorithms (such as vectorized hashCode or specialized array comparison)
which an auto-vectorizer may never optimize.
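A hand-coded vector loop might look like the following minimal sketch. It assumes the API shape found in later JDK releases of jdk.incubator.vector (IntVector.SPECIES_PREFERRED, fromArray, add, intoArray, VectorSpecies.loopBound), which may differ from the snapshot reviewed here, and it must be compiled and run with --add-modules jdk.incubator.vector. The class name is hypothetical, and a and b are assumed to be at least as long as c.

```java
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example class; requires --add-modules jdk.incubator.vector.
final class VectorAddLoop {
    private static final VectorSpecies<Integer> SPECIES = IntVector.SPECIES_PREFERRED;

    static void add(int[] a, int[] b, int[] c) {
        int i = 0;
        // Largest multiple of the lane count that does not exceed c.length.
        int upper = SPECIES.loopBound(c.length);
        for (; i < upper; i += SPECIES.length()) {
            IntVector va = IntVector.fromArray(SPECIES, a, i);
            IntVector vb = IntVector.fromArray(SPECIES, b, i);
            va.add(vb).intoArray(c, i);
        }
        // Scalar tail for the elements that do not fill a whole vector.
        for (; i < c.length; i++) {
            c[i] = a[i] + b[i];
        }
    }
}
```

The scalar tail keeps the sketch simple; a masked final iteration is another option.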
There are numerous domains where this explicitly vectorizing
API may be applicable such as machine learning, linear algebra, cryptography,
finance, and usages within the JDK itself.
Specification
The implementation of the Vector API exports the following interfaces, classes, and enum in the package jdk.incubator.vector, defined in module jdk.incubator.vector.
Interfaces
VectorOperators.Associative Binary associative lane-wise operations that are applicable to vector lane values of some or all lane types.
VectorOperators.Binary Binary lane-wise operations that are applicable to vector lane values of some or all lane types.
VectorOperators.Comparison Binary lane-wise comparisons that are applicable to vector lane values of all lane types.
VectorOperators.Conversion<E,F> Conversion operations that are applicable to vector lane values of specific lane types.
VectorOperators.Operator Lane-wise operations that are applicable to vector lane values of some or all lane types.
VectorOperators.Ternary Ternary lane-wise operations that are applicable to vector lane values of some or all lane types.
VectorOperators.Unary Unary lane-wise operations that are applicable to vector lane values of some or all lane types.
VectorSpecies<E> Interface for managing all vectors of the same combination of element type (ETYPE) and shape.
Classes
ByteVector A specialized Vector representing an ordered immutable sequence of byte values.
DoubleVector A specialized Vector representing an ordered immutable sequence of double values.
FloatVector A specialized Vector representing an ordered immutable sequence of float values.
IntVector A specialized Vector representing an ordered immutable sequence of int values.
LongVector A specialized Vector representing an ordered immutable sequence of long values.
ShortVector A specialized Vector representing an ordered immutable sequence of short values.
Vector<E> A sequence of a fixed number of lanes, all of some fixed element type such as byte, long, or float.
VectorMask<E> A VectorMask represents an ordered immutable sequence of boolean values.
VectorOperators This class consists solely of static constants that describe lane-wise vector operations, plus nested interfaces which classify them.
VectorShuffle<E> A VectorShuffle represents an ordered immutable sequence of int values called source indexes, where each source index numerically selects a source lane from a Vector of a compatible vector species.
Enum
VectorShape A VectorShape selects a particular implementation of Vectors.
A vector is represented by the abstract class Vector<E>, where type variable E corresponds to the boxed type of the scalar primitive integral or floating point element types covered by the vector.
Vector<E> declares a set of methods for common vector operations supported by all element types. To reduce the surface of the API, instead of defining a method for each supported operation, the API defines methods for each category of operations (such as lanewise(), reduceLanes(), compare(), etc.). The operation to be performed is specified with an operator parameter. The supported operators are defined in the VectorOperators class as static final instances of the VectorOperators.Operator interface and its sub-interfaces. The sub-interfaces correspond to the classification of operators into groups such as unary (e.g. negation), binary (e.g. addition), comparison (e.g. lessThan), etc. That said, some common operations (such as add() and or()) are given their own named methods.
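The operator-parameter style can be sketched as below. The constant names (NEG, ADD, LT) and method signatures are taken from later JDK releases of the incubator module and are an assumption with respect to the snapshot reviewed here; the class and helper names are hypothetical. The code requires --add-modules jdk.incubator.vector, and each input array is assumed to hold at least four elements.

```java
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example class; requires --add-modules jdk.incubator.vector.
final class OperatorDemo {
    // A 128-bit shape holds four int lanes.
    private static final VectorSpecies<Integer> S = IntVector.SPECIES_128;

    // Unary operator passed to lanewise().
    static int[] negated(int[] a) {
        return IntVector.fromArray(S, a, 0).lanewise(VectorOperators.NEG).toArray();
    }

    // Binary operator passed to lanewise(); equivalent to the named method add().
    static int[] sum(int[] a, int[] b) {
        IntVector vb = IntVector.fromArray(S, b, 0);
        return IntVector.fromArray(S, a, 0).lanewise(VectorOperators.ADD, vb).toArray();
    }

    // Associative operator passed to reduceLanes() to combine all lanes into one scalar.
    static int total(int[] a) {
        return IntVector.fromArray(S, a, 0).reduceLanes(VectorOperators.ADD);
    }

    // Comparison operator passed to compare(); the result is one boolean per lane.
    static boolean[] lessThan(int[] a, int[] b) {
        IntVector vb = IntVector.fromArray(S, b, 0);
        return IntVector.fromArray(S, a, 0).compare(VectorOperators.LT, vb).toArray();
    }
}
```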
The package has specialized implementations of Vector<E> for each E in the set {Byte, Short, Integer, Long, Float, Double}. These classes export operations specific to an element type, such as bitwise operations (e.g. logical or), which are specific to integral sub-types, and mathematical operations (e.g. transcendental functions like pow()) for floating point sub-types.
A Vector has an element type, which is represented by the type variable E, and a shape, which defines the size of the vector in bits. The enum VectorShape enumerates the shapes supported by the API.
The element type and shape together form a species, represented by VectorSpecies<E>. Species play a role in the creation and type conversion of vectors, masks and shuffles.
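The relationship between element type, shape, and species can be sketched as follows. The calls used (VectorSpecies.of, VectorShape.S_256_BIT, length, vectorBitSize, IntVector.broadcast) exist in later JDK releases of the incubator module and are an assumption with respect to the snapshot reviewed here; the class and method names are hypothetical. The code requires --add-modules jdk.incubator.vector.

```java
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorShape;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example class; requires --add-modules jdk.incubator.vector.
final class SpeciesDemo {
    // Element type int combined with a 256-bit shape yields a species with 8 lanes.
    static VectorSpecies<Integer> int256() {
        return VectorSpecies.of(int.class, VectorShape.S_256_BIT);
    }

    // A species is used to create vectors, here with every lane set to the same value.
    static int[] filled(VectorSpecies<Integer> species, int value) {
        return IntVector.broadcast(species, value).toArray();
    }
}
```

For the species returned by int256(), length() reports 8 lanes and vectorBitSize() reports 256, since each int lane occupies 32 bits.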
To support control flow, relevant vector operations optionally accept masks, represented by the public abstract class VectorMask<E>. Each element in a mask, a boolean value or bit, corresponds to a vector lane. When a mask is an input to an operation it governs whether the operation is applied to a particular lane; the operation is applied to a lane if the mask bit for that lane is set (is true). Alternative behavior occurs if the mask bit is not set (is false). Comparison operations produce masks, which can then be input to other operations to selectively disable the operation on certain lanes and thereby emulate flow control.
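A minimal sketch of mask-driven flow control follows: a comparison produces a mask, and the mask then restricts an addition to the selected lanes. It assumes the signatures from later JDK releases (compare(Comparison, int) and lanewise(Binary, int, VectorMask)), where lanes with an unset mask bit keep their original value for this operation. The class and method names are hypothetical, the input is assumed to hold at least four elements, and the code requires --add-modules jdk.incubator.vector.

```java
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorMask;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example class; requires --add-modules jdk.incubator.vector.
final class MaskDemo {
    private static final VectorSpecies<Integer> S = IntVector.SPECIES_128; // four int lanes

    // Vector form of: if (a[i] < 0) a[i] += 100;
    static int[] bumpNegatives(int[] a) {
        IntVector v = IntVector.fromArray(S, a, 0);
        // The comparison sets the mask bit for each lane holding a negative value.
        VectorMask<Integer> negative = v.compare(VectorOperators.LT, 0);
        // The addition is applied only where the mask bit is set.
        return v.lanewise(VectorOperators.ADD, 100, negative).toArray();
    }
}
```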
A VectorShuffle represents an ordered immutable sequence of int values. A VectorShuffle can be used with a shuffle-accepting vector operation to control the rearrangement of lane elements of input vectors.
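As a sketch of how a shuffle controls rearrangement, the example below reverses the four lanes of a 128-bit int vector. It assumes VectorShuffle.fromValues and IntVector.rearrange as found in later JDK releases, where lane i of the result takes the value of the source lane named by index i of the shuffle. The class and method names are hypothetical, the input is assumed to hold at least four elements, and the code requires --add-modules jdk.incubator.vector.

```java
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorShuffle;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example class; requires --add-modules jdk.incubator.vector.
final class ShuffleDemo {
    private static final VectorSpecies<Integer> S = IntVector.SPECIES_128; // four int lanes

    static int[] reversed(int[] a) {
        // Source indexes: result lane 0 reads source lane 3, lane 1 reads lane 2, and so on.
        VectorShuffle<Integer> reverse = VectorShuffle.fromValues(S, 3, 2, 1, 0);
        return IntVector.fromArray(S, a, 0).rearrange(reverse).toArray();
    }
}
```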
The javadoc for the package with the implementation as of July 18, 2019 is at http://cr.openjdk.java.net/~kkharbas/vector-api/CSR/javadoc.02/jdk.incubator.vector/jdk/incubator/vector/package-summary.html and also attached here.
More details can be found in the JEP issue - https://bugs.openjdk.java.net/browse/JDK-8201271
csr of: JDK-8223347 Integration of Vector API (Incubator) (Resolved)
relates to: JDK-8254622 Hide superclasses from conditionally exported packages (In Progress)