- Bug
- Resolution: Fixed
- P4
- 11, 17, 19, 21, 22, 23
- b05
For example:
vmovd %xmm9,0x13(%rdx,%r13,1)
I have 5 examples.
I would like confirmation, from a machine that actually requires AlignVector, of whether this ever leads to true failures.
-------------------------- Test::test0 --------------------------
./java -Xcomp -XX:-TieredCompilation -XX:CompileCommand=compileonly,Test::test0 -XX:+TraceNewVectors -XX:+TraceSuperWord -XX:+TraceLoopOpts -XX:MaxVectorSize=16 -XX:+AlignVector -XX:+Verbose -XX:LoopUnrollLimit=10000 Test.java
This and test1 are some control tests, just to see that we get vectorization in the safe cases.
Here, I get packs that are 0-aligned or 8-aligned, with length of 4 bytes per vector.
-------------------------- Test::test1 --------------------------
./java -Xcomp -XX:-TieredCompilation -XX:CompileCommand=compileonly,Test::test1 -XX:+TraceNewVectors -XX:+TraceSuperWord -XX:+TraceLoopOpts -XX:MaxVectorSize=16 -XX:+AlignVector -XX:+Verbose -XX:LoopUnrollLimit=10000 Test.java
This is also a control test for vectorization in a safe case.
I get 16-packs, all 0-aligned. Good.
-------------------------- Test::test2 --------------------------
./java -Xcomp -XX:-TieredCompilation -XX:CompileCommand=compileonly,Test::test2 -XX:+TraceNewVectors -XX:+TraceSuperWord -XX:+TraceLoopOpts -XX:MaxVectorSize=16 -XX:+AlignVector -XX:+Verbose -XX:LoopUnrollLimit=10000 Test.java
This case is somewhat surprising: we do not vectorize, even though we technically would be allowed to.
find_adjacent_refs seems to find no memref to align to. I think the reason is that no memref is 0-aligned with the vector width (there is no "i+0" case).
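The Test.java source is not reproduced here, so the following is only a hypothetical sketch of the two loop shapes being contrasted: one where every access sits at i+3..i+6 (no "i+0" case, as in test2) and one with an additional access at "i+0" (as described for test3). The class name, method names, the stride of 8 and the masking operation are all assumptions made for illustration; they are not taken from the original test.

```java
public class LoopShapeSketch {
    // Hypothetical test2-like loop: all accesses are at i+3..i+6, none at i+0.
    static void withoutZeroOffset(byte[] a, byte[] b, byte mask) {
        for (int i = 0; i + 6 < a.length; i += 8) {
            b[i + 3] = (byte) (a[i + 3] & mask);
            b[i + 4] = (byte) (a[i + 4] & mask);
            b[i + 5] = (byte) (a[i + 5] & mask);
            b[i + 6] = (byte) (a[i + 6] & mask);
        }
    }

    // Hypothetical test3-like loop: same body, plus one access at i+0.
    static void withZeroOffset(byte[] a, byte[] b, byte mask) {
        for (int i = 0; i + 6 < a.length; i += 8) {
            b[i] = (byte) (a[i] & mask);
            b[i + 3] = (byte) (a[i + 3] & mask);
            b[i + 4] = (byte) (a[i + 4] & mask);
            b[i + 5] = (byte) (a[i + 5] & mask);
            b[i + 6] = (byte) (a[i + 6] & mask);
        }
    }
}
```

Both methods assume that b is at least as long as a. Whether C2 vectorizes either shape depends on the flags shown in the command lines above; the sketch only illustrates the access pattern.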
-------------------------- Test::test3 --------------------------
./java -Xcomp -XX:-TieredCompilation -XX:CompileCommand=compileonly,Test::test3 -XX:+TraceNewVectors -XX:+TraceSuperWord -XX:+TraceLoopOpts -XX:MaxVectorSize=16 -XX:+AlignVector -XX:+Verbose -XX:LoopUnrollLimit=10000 Test.java
This case is the same as test2, but with an additional access at "i+0".
Now find_adjacent_refs finds a memref to align to. It turns out later, however, that this memref is not actually part of any vector!
We still create some 4-packs, but they are 3- or 11-aligned!
And these are some assembly instructions I can find with -XX:CompileCommand=print,Test::test3:
vmovd %xmm9,0x13(%rdx,%r13,1)
It is possible that these accesses still end up aligned, since everything sits at a "3-offset". But given that we align to the best memref found in find_adjacent_refs, this is implausible: that memref has 0-alignment, while the vectors have 3/11 alignment.
Pack: 0
align: 3 2063 StoreB === 2274 2066 2087 2064 [[ 2060 2062 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=1634,1362,1153,215 !jvms: Test::test3 @ bci:34 (line 56)
align: 4 2060 StoreB === 2274 2063 2089 2061 [[ 2055 2059 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=1631,1349,1150,263 !jvms: Test::test3 @ bci:47 (line 57)
align: 5 2055 StoreB === 2274 2060 2056 2058 [[ 2052 2054 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=1626,1346,1147,311 !jvms: Test::test3 @ bci:60 (line 58)
align: 6 2052 StoreB === 2274 2055 2098 2053 [[ 2049 2051 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=1623,1343,1143,359,1207 !jvms: Test::test3 @ bci:75 (line 59)
Pack: 24
align: 11 2046 StoreB === 2274 2049 2095 2047 [[ 2043 2045 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=1617,1337,215 !jvms: Test::test3 @ bci:34 (line 56)
align: 12 2043 StoreB === 2274 2046 2096 2044 [[ 2040 2042 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=1614,1334,263 !jvms: Test::test3 @ bci:47 (line 57)
align: 13 2040 StoreB === 2274 2043 2092 2041 [[ 2037 2039 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=1611,1331,311 !jvms: Test::test3 @ bci:60 (line 58)
align: 14 2037 StoreB === 2274 2040 2088 2038 [[ 2002 2004 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=1608,1327,359,1207 !jvms: Test::test3 @ bci:75 (line 59)
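The alignment argument for test3 can be checked with a little modular arithmetic. The sketch below assumes that the reference memref is aligned to the vector width and that a pack is aligned only if its starting "align" value is a multiple of that width (4 bytes for a 4-pack of bytes). The class and method names are hypothetical and are not part of HotSpot.

```java
public class PackAlignmentSketch {
    // Residue of a pack's starting offset relative to a reference
    // that is assumed to be aligned to widthBytes.
    static int residue(int packStart, int widthBytes) {
        return Math.floorMod(packStart, widthBytes);
    }

    // A pack is treated as aligned only if it starts on a multiple of its width.
    static boolean isAligned(int packStart, int widthBytes) {
        return residue(packStart, widthBytes) == 0;
    }

    public static void main(String[] args) {
        int widthBytes = 4; // 4-pack of StoreB
        // 0 and 8 are the control-case starts from test0; 3 and 11 are from test3.
        for (int start : new int[] {0, 8, 3, 11}) {
            System.out.println("start=" + start
                    + " residue=" + residue(start, widthBytes)
                    + " aligned=" + isAligned(start, widthBytes));
        }
    }
}
```

Under these assumptions the 3- and 11-aligned packs share the residue 3, which matches the "3-offset" observation: they are consistent with each other, but not with a 0-aligned reference.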
-------------------------- Test::test4 --------------------------
./java -Xcomp -XX:-TieredCompilation -XX:CompileCommand=compileonly,Test::test4 -XX:+TraceNewVectors -XX:+TraceSuperWord -XX:+TraceLoopOpts -XX:MaxVectorSize=16 -XX:+AlignVector -XX:+Verbose -XX:LoopUnrollLimit=10000 Test.java
Run with - to see the generated assembly. For example, I see:
vmovd 0x20(%rsi,%r11,1),%xmm14
vmovq 0x25(%rsi,%r11,1),%xmm26
These cannot both be aligned: the two offsets differ by 5, and 0x25 is odd!
The -XX:+TraceSuperWord output shows a 4-pack (0-aligned) and an 8-pack (5-aligned), which correspond to the two assembly instructions above:
Pack: 0
align: 0 3139 StoreB === 3358 3362 3155 3140 [[ 3136 3138 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2517,2124,1801,187 !jvms: Test::test4 @ bci:30 (line 53)
align: 1 3136 StoreB === 3358 3139 3166 3137 [[ 3133 3135 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2514,2121,1798,238 !jvms: Test::test4 @ bci:49 (line 54)
align: 2 3133 StoreB === 3358 3136 3163 3134 [[ 3130 3132 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2511,2118,1795,290 !jvms: Test::test4 @ bci:68 (line 55)
align: 3 3130 StoreB === 3358 3133 3165 3131 [[ 3127 3129 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2508,2115,1792,342 !jvms: Test::test4 @ bci:87 (line 56)
Pack: 48
align: 5 3127 StoreB === 3358 3130 3160 3128 [[ 3124 3126 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2505,2112,1789,394 !jvms: Test::test4 @ bci:106 (line 58)
align: 6 3124 StoreB === 3358 3127 3158 3125 [[ 3121 3123 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2502,2109,1786,446 !jvms: Test::test4 @ bci:127 (line 59)
align: 7 3121 StoreB === 3358 3124 3156 3122 [[ 3118 3120 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2499,2106,1783,498 !jvms: Test::test4 @ bci:148 (line 60)
align: 8 3118 StoreB === 3358 3121 3164 3119 [[ 3115 3117 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2496,2103,1780,550 !jvms: Test::test4 @ bci:169 (line 61)
align: 9 3115 StoreB === 3358 3118 3159 3116 [[ 3112 3114 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2493,2100,1777,602 !jvms: Test::test4 @ bci:190 (line 62)
align: 10 3112 StoreB === 3358 3115 3162 3113 [[ 3109 3111 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2490,2097,1774,654 !jvms: Test::test4 @ bci:211 (line 63)
align: 11 3109 StoreB === 3358 3112 3157 3110 [[ 3106 3108 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2487,2094,1771,706 !jvms: Test::test4 @ bci:232 (line 64)
align: 12 3106 StoreB === 3358 3109 3161 3107 [[ 3103 3105 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):BotPTR:exact+any *, idx=6; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; !orig=2484,2091,1768,758,1841 !jvms: Test::test4 @ bci:253 (line 65)
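To make the test4 observation concrete, the sketch below searches for a residue of (base + index) modulo 8 that would make both accesses aligned at once. It assumes each access must be aligned to its own width in bytes (4 for vmovd, 8 for vmovq) and uses the displacements 0x20 and 0x25 from the assembly above. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetFeasibilitySketch {
    // Returns every residue r in [0, modulus) such that r + displacements[k]
    // is a multiple of widths[k] for all k. The modulus should be a common
    // multiple of all widths.
    static List<Integer> feasibleResidues(int[] displacements, int[] widths, int modulus) {
        List<Integer> feasible = new ArrayList<>();
        for (int r = 0; r < modulus; r++) {
            boolean allAligned = true;
            for (int k = 0; k < displacements.length; k++) {
                if ((r + displacements[k]) % widths[k] != 0) {
                    allAligned = false;
                    break;
                }
            }
            if (allAligned) {
                feasible.add(r);
            }
        }
        return feasible;
    }

    public static void main(String[] args) {
        int[] displacements = {0x20, 0x25}; // from the vmovd and vmovq operands
        int[] widths = {4, 8};              // bytes moved by vmovd and vmovq
        System.out.println(feasibleResidues(displacements, widths, 8));
    }
}
```

The 4-byte access needs a residue of 0 or 4, while the 8-byte access needs a residue of 3, so under these assumptions no single base address satisfies both.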
- duplicates
  - JDK-8309662 [IR Framework] Add AlignVector to whitelist (Closed)
  - JDK-8303827 C2 SuperWord: allow more fine grained alignment for +AlignVector (Closed)
  - JDK-8311586 C2 SuperWord: introduce VerifyAlignVector (runtime alignment check) (Closed)
- is blocked by
  - JDK-8316594 C2 SuperWord: wrong result with hand unrolled loops (Resolved)
  - JDK-8313717 C2: assert(false) failed: infinite loop in PhaseIterGVN::optimize (Closed)
- relates to
  - JDK-8309662 [IR Framework] Add AlignVector to whitelist (Closed)
  - JDK-8344424 C2 SuperWord: mixed type loops do not vectorize with UseCompactObjectHeaders and AlignVector (Open)
  - JDK-8328938 C2 SuperWord: disable vectorization for large stride and scale (Resolved)
  - JDK-8303827 C2 SuperWord: allow more fine grained alignment for +AlignVector (Closed)
  - JDK-8320587 TestAlignVectorFuzzer fails on ARM32 (Open)
  - JDK-8323582 C2 SuperWord AlignVector: misaligned vector memory access with unaligned native memory (In Progress)
  - JDK-8314612 TestUnorderedReduction.java fails with -XX:MaxVectorSize=32 and -XX:+AlignVector (Resolved)
  - JDK-8323641 Test compiler/loopopts/superword/TestAlignVectorFuzzer.java timed out (Resolved)
  - JDK-8339349 Crash in the GC running the DaCapo spring benchmark (Closed)
  - JDK-8323577 C2 SuperWord: remove AlignVector restrictions on IR tests added in JDK-8305055 (Resolved)