-
Enhancement
-
Resolution: Unresolved
-
P4
-
21
Do the same as JDK-8302652, but for the Vector API.
[~jrose] sketched it like this:
From
int a = 0; for (…) { … a += v.reduceLanes(ADD) … }
to
vector<int> asplit = zeroes(); for (…) { ... asplit = asplit.add(v) … }; int a = asplit.reduceLanes(ADD);
I coded up a concrete int-add-reduction example (dot product).
./java -Xbatch -XX:CompileCommand=printcompilation,Test::* -XX:CompileCommand=exclude,Test::test00 -XX:UseAVX=2 -XX:CompileCommand=print,Test::test11 Test.java > txt1.txt
./java -Xbatch -XX:CompileCommand=printcompilation,Test::* -XX:CompileCommand=exclude,Test::test00 -XX:UseAVX=2 -XX:CompileCommand=print,Test::test12 Test.java > txt1.txt
Grepping the output for "vector_reduction_int" shows that test11 has many reductions inside the loop, whereas test12 has only 2 reductions in the whole compilation.
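For illustration, the From/To shapes sketched above can be modeled in plain Java. This is only a minimal sketch, not the Test.java used for the runs above: lanes are emulated with an int[] so that it runs without the incubator module, and the class name ReductionSketch, the lane count LANES, and the helper methods are all hypothetical. The comments note which Vector API operation each helper stands in for.

```java
public class ReductionSketch {
    // Hypothetical lane count; stands in for the length of a vector species.
    static final int LANES = 8;

    // Models v.reduceLanes(ADD): folds all lanes into one scalar.
    static int reduceLanes(int[] v) {
        int sum = 0;
        for (int x : v) sum += x;
        return sum;
    }

    // Models loading LANES elements of a and b at offset i and multiplying lane-wise.
    static int[] mulLanes(int[] a, int[] b, int i) {
        int[] v = new int[LANES];
        for (int l = 0; l < LANES; l++) v[l] = a[i + l] * b[i + l];
        return v;
    }

    // "From" shape: one cross-lane reduction per loop iteration.
    static int dotReduceInLoop(int[] a, int[] b) {
        int acc = 0;
        for (int i = 0; i < a.length; i += LANES) {
            acc += reduceLanes(mulLanes(a, b, i));
        }
        return acc;
    }

    // "To" shape: lane-wise accumulation in the loop, a single reduction after it.
    static int dotReduceAfterLoop(int[] a, int[] b) {
        int[] asplit = new int[LANES];               // zeroes()
        for (int i = 0; i < a.length; i += LANES) {
            int[] v = mulLanes(a, b, i);
            for (int l = 0; l < LANES; l++) {
                asplit[l] += v[l];                   // asplit = asplit.add(v)
            }
        }
        return reduceLanes(asplit);                  // asplit.reduceLanes(ADD)
    }

    public static void main(String[] args) {
        int n = 4 * LANES; // assumed to be a multiple of LANES, so no tail loop is needed
        int[] a = new int[n];
        int[] b = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i + 1;
            b[i] = 2;
        }
        System.out.println(dotReduceInLoop(a, b));    // prints 1056
        System.out.println(dotReduceAfterLoop(a, b)); // prints 1056
    }
}
```

Both shapes give the same result for int because int addition is associative and commutative even when it overflows, so reordering the additions is exact. Floating-point addition is not associative in general, so the same reordering can change the result there.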
- relates to
-
JDK-8315024 Vector API FP reduction tests should not test for exact equality
- Resolved
-
JDK-8302652 [SuperWord] Reduction should happen after loop, when possible
- Resolved