A DESCRIPTION OF THE PROBLEM :
Currently the effective stride size limit is 8, which restricts which loops can be unrolled.
This was originally motivated by a discussion about Panama vector support:
For example, with vector operations on 256-bit AVX, loops over int get unrolled, but loops over short and byte do not. This is because the stride size is 8 for int, 16 for short, and 32 for byte.
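The stride figures above follow from dividing the vector width by the element width. A minimal sketch of that arithmetic, assuming the limit of 8 quoted in this report (the class and method names are hypothetical and are not taken from HotSpot or from the pull request):

```java
// Illustrative sketch only. REPORTED_STRIDE_LIMIT is the value quoted in this
// report; it is not read from HotSpot, and the names below are hypothetical.
final class StrideDemo {
    static final int REPORTED_STRIDE_LIMIT = 8;

    // Number of elements consumed per loop iteration when each iteration
    // processes one full vector of vectorBits bits.
    static int stride(int vectorBits, int elementBits) {
        return vectorBits / elementBits;
    }

    // True if a loop with this stride stays within the reported limit.
    static boolean withinReportedLimit(int stride) {
        return stride <= REPORTED_STRIDE_LIMIT;
    }

    public static void main(String[] args) {
        int vectorBits = 256; // 256-bit AVX
        String[] names = {"int", "short", "byte"};
        int[] elementBits = {Integer.SIZE, Short.SIZE, Byte.SIZE};
        for (int i = 0; i < names.length; i++) {
            int s = stride(vectorBits, elementBits[i]);
            System.out.println(names[i] + ": stride " + s
                    + ", within reported limit: " + withinReportedLimit(s));
        }
    }
}
```

Under these assumptions only the int loop stays within the limit, which matches the behaviour described above.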
Original discussion:
https://mail.openjdk.java.net/pipermail/panama-dev/2021-June/014310.html
An older ticket that may be related to this (I could not determine the actual cause):
https://mail.openjdk.java.net/pipermail/panama-dev/2021-June/014310.html
Pull request with sample code:
https://github.com/openjdk/jdk/pull/4658