Bug
Resolution: Fixed
P4
17, 18, 19, 20, 21
b25
x86_64
When running the test compiler.loopopts.superword.ProdRed_Double with -XX:+SuperWordReductions and -XX:LoopMaxUnroll>=8 on x86_64, C2 is expected to vectorize the product reduction loop in prodReductionImplement(), but it fails to do so in every run across a range of x86_64 CPUs with different vectorization capabilities.
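For readers unfamiliar with the term, the following is a minimal sketch of the kind of double product reduction loop that C2 is expected to vectorize here. The class name, method name, and exact loop body are assumptions made for illustration and are not copied from the test source.

```java
public class ProdRedSketch {
    // Hypothetical stand-in for a product reduction kernel: the running
    // product 'total' is multiplied by each element-wise difference, so every
    // iteration depends on the result of the previous one.
    static double prodReduction(double[] a, double[] b, double total) {
        for (int i = 0; i < a.length; i++) {
            total *= a[i] - b[i];
        }
        return total;
    }

    public static void main(String[] args) {
        double[] a = {3.0, 5.0, 4.0};
        double[] b = {1.0, 3.0, 2.0};
        // Differences are 2.0, 2.0, 2.0, so the product starting from 1.0 is 8.0.
        System.out.println(prodReduction(a, b, 1.0));
    }
}
```

The loop-carried dependency on the accumulator is what makes this a reduction: vectorizing it requires a dedicated reduction operation rather than plain element-wise vector instructions.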
HOW TO REPRODUCE
On a linux-x86_64-server-fastdebug build, run
$ make run-test TEST="compiler/loopopts/superword/ProdRed_Double.java" TEST_VM_OPTS="-XX:CompileCommand=PrintAssembly,compiler.loopopts.superword.ProdRed_Double::prodReductionImplement"
$ grep vector_reduction_double build/linux-x86_64-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_loopopts_superword_ProdRed_Double_java/compiler/loopopts/superword/ProdRed_Double.jtr
We expect to find some matches of 'vector_reduction_double', but get none.
INITIAL ANALYSIS
SuperWord::construct_bb() relies on ReductionNode::implemented() to identify vectorizable reduction uses [1]. Among other arguments, ReductionNode::implemented() takes the minimum vector size for the reduction type (vlen), and fails trivially if it is less than or equal to 1 [2]. This is always the case in the context of SuperWord::construct_bb(), since vlen is simply set to the result of Matcher::min_vector_size(), which since JDK-8265783 always returns 1 for the 'double' type [3]. Reverting the changes made by JDK-8265783 to Matcher::min_vector_size() (in x86.ad) re-enables vectorization of ProdRed_Double.
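The interaction described above can be modeled with a small, simplified sketch. It is written in Java purely for illustration and does not reproduce the HotSpot C++ sources: the class, both method names, and the platformSupportsOp parameter are hypothetical, and the real check involves more conditions than the single guard shown.

```java
public class ReductionGuardSketch {
    // Hypothetical model of the platform query: assumed to report a minimum
    // vector size of 1 for double, as the analysis above describes.
    static int minVectorSizeForDouble() {
        return 1;
    }

    // Simplified model of the guard: a reduction is rejected outright when
    // vlen <= 1. 'platformSupportsOp' is a placeholder for all other checks.
    static boolean reductionImplemented(int vlen, boolean platformSupportsOp) {
        if (vlen <= 1) {
            return false;
        }
        return platformSupportsOp;
    }

    public static void main(String[] args) {
        int vlen = minVectorSizeForDouble();
        // With vlen == 1 the guard rejects the reduction even if the platform
        // could otherwise perform it.
        System.out.println(reductionImplemented(vlen, true));
        // A minimum size of 2 would pass the guard.
        System.out.println(reductionImplemented(2, true));
    }
}
```

In this model, the rejection comes entirely from the value passed in as vlen, which matches the observation that restoring the earlier minimum vector size re-enables vectorization.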
Thanks to Daniel Skantz for pointing out the issue, found while working on JDK-8294715.
[1] https://github.com/openjdk/jdk/blob/5a4945c0d95423d0ab07762c915e9cb4d3c66abb/src/hotspot/share/opto/superword.cpp#L3355
[2] https://github.com/openjdk/jdk/blob/5a4945c0d95423d0ab07762c915e9cb4d3c66abb/src/hotspot/share/opto/vectornode.cpp#L1468
[3] https://github.com/openjdk/jdk/blob/5a4945c0d95423d0ab07762c915e9cb4d3c66abb/src/hotspot/cpu/x86/x86.ad#L2293-L2295
relates to
- JDK-8294715 Add IR checks to the reduction vectorization tests (Resolved)
- JDK-8265783 Create a separate library for x86 Intel SVML assembly intrinsics (Resolved)
- JDK-8302139 Speed up SuperWord reduction tests (Closed)