Enhancement
Resolution: Unresolved
P4
26
I discovered yet another issue with missing RCE during work on JDK-8324751.
At first, I thought the issue was limited to getAtIndex (JDK-8360204), but it seems to happen with get/set as well. The two may be duplicates!
NOTE: we already have a test integrated with JDK-8324751, so just edit that one:
test/hotspot/jtreg/compiler/loopopts/superword/TestMemorySegment_8365982.java
We see that the loop does some "Predicate RC" before unrolling, but it does not seem to eliminate all range checks at that point.
After unrolling 4x, it multiversions (strange!).
Then, with PreMainPost, we do "RangeCheck", and now the main loop of the multiversion_fast loop no longer has any range checks.
Question: why was this RangeCheck not eliminated at the beginning?
Why do we multiversion? Did we lose the connection to the predicates somewhere in between?
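The Test.java used for the run is not attached to this report. As a rough illustration only, a kernel of the kind discussed here could look like the sketch below. It assumes a simple int copy loop over two heap-backed segments using byte-offset get/set. The loop shape, the 1000-element size, and the "+ 1" are assumptions made for illustration and are not taken from the original test, so this sketch may or may not show the same TraceLoopOpts behavior.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class Test {
    // Hypothetical kernel, NOT the reproducer from this issue.
    // Each get/set performs a bounds check against its segment; these are
    // the range checks that C2 would ideally hoist out of the loop.
    static void test(MemorySegment src, MemorySegment dst) {
        long count = src.byteSize() / Integer.BYTES;
        for (long i = 0; i < count; i++) {
            long offset = i * Integer.BYTES; // byte offset, as get/set expect
            int v = src.get(ValueLayout.JAVA_INT_UNALIGNED, offset);
            dst.set(ValueLayout.JAVA_INT_UNALIGNED, offset, v + 1);
        }
    }

    public static void main(String[] args) {
        // Assumed size; chosen only so the loop runs long enough to be compiled.
        MemorySegment src = MemorySegment.ofArray(new int[1000]);
        MemorySegment dst = MemorySegment.ofArray(new int[1000]);
        // Repeat the call so that the method gets hot and C2 compiles it.
        for (int iter = 0; iter < 10_000; iter++) {
            test(src, dst);
        }
        System.out.println(dst.get(ValueLayout.JAVA_INT_UNALIGNED, 0));
    }
}
```

The class and method are named Test and test so that the CompileCommand patterns Test::test match. Swapping get/set for getAtIndex/setAtIndex in the same loop would give the variant tracked in JDK-8360204.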
java -XX:CompileCommand=compileonly,Test::test -XX:CompileCommand=printcompilation,Test::test -Xbatch -XX:+TraceLoopOpts Test.java
5678 111 b 4 Test::test (133 bytes)
Counted Loop: N1971/N1969 limit_check short_running profile_predicated predicated sfpts={ 1809 }
Loop: N0/N0 has_sfpt
Loop: N1971/N1969 limit_check short_running profile_predicated predicated sfpts={ 1809 }
Loop: N0/N0 has_sfpt
Loop: N1971/N1969 limit_check short_running profile_predicated predicated sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Predicate IC Loop: N1971/N1969 limit_check short_running profile_predicated predicated rce sfpts={ 1809 }
Loop: N0/N0 has_sfpt
Loop: N1971/N1969 limit_check short_running profile_predicated predicated sfpts={ 1809 }
Exceeding node budget: 0 < 62
Counted Loop: N2328/N2322 counted [0,int),-1 (-1 iters)
Loop: N0/N0 has_sfpt
Loop: N2327/N2326 limit_check short_running profile_predicated predicated
Loop: N2328/N2322 limit_check short_running profile_predicated predicated counted [0,int),-1 (-1 iters) has_sfpt strip_mined
Predicate RC Loop: N2328/N2322 limit_check short_running profile_predicated predicated counted [0,int),-1 (1000 iters) has_sfpt strip_mined
Loop: N0/N0 has_sfpt
Loop: N2327/N2326 limit_check short_running profile_predicated predicated sfpts={ 2329 }
Loop: N2328/N2322 limit_check short_running profile_predicated predicated counted [0,int),-1 (1000 iters) rc has_sfpt strip_mined
PreMainPost Loop: N2328/N2322 limit_check short_running profile_predicated predicated counted [0,int),-1 (1000 iters) rc has_sfpt strip_mined
Unroll 2 Loop: N2328/N2322 limit_check counted [int,int),-1 (1000 iters) main rc has_sfpt strip_mined
Loop: N0/N0 has_sfpt
Loop: N2536/N2534 limit_check short_running profile_predicated predicated counted [0,int),-1 (4 iters) pre rc
Loop: N2327/N2326 limit_check sfpts={ 2329 }
Loop: N2651/N2322 limit_check counted [int,int),-2 (1000 iters) main rc has_sfpt strip_mined
Loop: N2433/N2431 limit_check counted [int,int),-1 (4 iters) post rc
Unroll 4 Loop: N2651/N2322 limit_check counted [int,int),-2 (1000 iters) main rc has_sfpt rce strip_mined
Loop: N0/N0 has_sfpt
Loop: N2536/N2534 limit_check short_running profile_predicated predicated counted [0,int),-1 (4 iters) pre rc
Loop: N2327/N2326 limit_check sfpts={ 2329 }
Loop: N2835/N2322 limit_check counted [int,int),-4 (1000 iters) main rc has_sfpt strip_mined
Loop: N2433/N2431 limit_check counted [int,int),-1 (4 iters) post rc
Exceeding node budget: 0 < 106
Loop: N0/N0 has_sfpt
Loop: N2536/N2534 limit_check short_running profile_predicated predicated counted [0,int),-1 (4 iters) pre rc
Loop: N2327/N2326 limit_check sfpts={ 2329 }
Loop: N2835/N2322 limit_check counted [0,int),-4 (1000 iters) rc has_sfpt strip_mined
Loop: N2433/N2431 limit_check counted [int,int),-1 (4 iters) post rc
Multiversion Loop: N2835/N2322 limit_check counted [0,int),-4 (1000 iters) rc has_sfpt rce strip_mined
PreMainPost Loop: N2835/N2322 limit_check counted [0,int),-4 (1000 iters) rc multiversion_fast has_sfpt rce strip_mined
RangeCheck Loop: N2835/N2322 limit_check counted [int,int),-4 (1000 iters) main rc multiversion_fast has_sfpt rce strip_mined
Loop: N0/N0 has_sfpt
Loop: N2536/N2534 limit_check short_running profile_predicated predicated counted [0,int),-1 (4 iters) pre rc
Loop: N3442/N3443 limit_check sfpts={ 3445 }
Loop: N3427/N3438 limit_check counted [0,int),-4 (1000 iters) rc multiversion_delayed_slow has_sfpt strip_mined
Loop: N3714/N3725 counted [0,int),-4 (4 iters) pre rc multiversion_fast
Loop: N2327/N2326 limit_check sfpts={ 2329 }
Loop: N2835/N2322 limit_check counted [int,int),-4 (1000 iters) main multiversion_fast has_sfpt strip_mined
Loop: N3570/N3581 limit_check counted [int,int),-4 (4 iters) post rc multiversion_fast
Loop: N2433/N2431 limit_check counted [int,int),-1 (4 iters) post rc
Unroll 8 Loop: N2835/N2322 limit_check counted [int,int),-4 (1000 iters) main multiversion_fast has_sfpt strip_mined
Loop: N0/N0 has_sfpt
Loop: N2536/N2534 limit_check short_running profile_predicated predicated counted [0,int),-1 (4 iters) pre rc
Loop: N3442/N3443 limit_check sfpts={ 3445 }
Loop: N3427/N3438 limit_check counted [0,int),-4 (1000 iters) rc multiversion_delayed_slow has_sfpt strip_mined
Loop: N3714/N3725 counted [0,int),-4 (4 iters) pre rc multiversion_fast
Loop: N2327/N2326 limit_check sfpts={ 2329 }
Loop: N4339/N2322 limit_check counted [int,int),-8 (1000 iters) main multiversion_fast has_sfpt strip_mined
Loop: N3570/N3581 limit_check counted [int,int),-4 (4 iters) post rc multiversion_fast
Loop: N2433/N2431 limit_check counted [int,int),-1 (4 iters) post rc
Unroll 16 Loop: N4339/N2322 limit_check counted [int,int),-8 (1000 iters) main multiversion_fast has_sfpt strip_mined
Loop: N0/N0 has_sfpt
Loop: N2536/N2534 limit_check short_running profile_predicated predicated counted [0,int),-1 (4 iters) pre rc
Loop: N3442/N3443 limit_check sfpts={ 3445 }
Loop: N3427/N3438 limit_check counted [0,int),-4 (1000 iters) rc multiversion_delayed_slow has_sfpt strip_mined
Loop: N3714/N3725 counted [0,int),-4 (4 iters) pre rc multiversion_fast
Loop: N2327/N2326 limit_check sfpts={ 2329 }
Loop: N4657/N2322 limit_check counted [int,int),-16 (1000 iters) main multiversion_fast has_sfpt strip_mined
Loop: N3570/N3581 limit_check counted [int,int),-4 (4 iters) post rc multiversion_fast
Loop: N2433/N2431 limit_check counted [int,int),-1 (4 iters) post rc
Unroll 32 Loop: N4657/N2322 limit_check counted [int,int),-16 (1000 iters) main multiversion_fast has_sfpt strip_mined
Loop: N0/N0 has_sfpt
Loop: N2536/N2534 limit_check short_running profile_predicated predicated counted [0,int),-1 (4 iters) pre rc
Loop: N3442/N3443 limit_check sfpts={ 3445 }
Loop: N3427/N3438 limit_check counted [0,int),-4 (1000 iters) rc multiversion_delayed_slow has_sfpt strip_mined
Loop: N3714/N3725 counted [0,int),-4 (4 iters) pre rc multiversion_fast
Loop: N2327/N2326 limit_check sfpts={ 2329 }
Loop: N5064/N2322 limit_check counted [int,int),-32 (1000 iters) main multiversion_fast has_sfpt strip_mined
Loop: N3570/N3581 limit_check counted [int,int),-4 (4 iters) post rc multiversion_fast
Loop: N2433/N2431 limit_check counted [int,int),-1 (4 iters) post rc
Loop: N0/N0 has_sfpt
Loop: N2536/N2534 limit_check short_running profile_predicated predicated counted [0,int),-1 (4 iters) pre rc
Loop: N3442/N3443 limit_check sfpts={ 3445 }
Loop: N3427/N3438 limit_check counted [0,int),-4 (1000 iters) rc multiversion_delayed_slow has_sfpt strip_mined
Loop: N3714/N3725 counted [0,int),-4 (4 iters) pre rc multiversion_fast
Loop: N2327/N2326 limit_check sfpts={ 2329 }
Loop: N5064/N2322 limit_check counted [int,int),-32 (1000 iters) main multiversion_fast has_sfpt strip_mined
Loop: N3570/N3581 limit_check counted [int,int),-4 (4 iters) post rc multiversion_fast
Loop: N2433/N2431 limit_check counted [int,int),-1 (4 iters) post rc
PredicatesOff
Loop: N0/N0 has_sfpt
Loop: N2536/N2534 short_running predicated counted [0,int),-1 (4 iters) pre rc
Loop: N3442/N3443 limit_check sfpts={ 3445 }
Loop: N3427/N3438 limit_check counted [0,int),-4 (1000 iters) rc multiversion_delayed_slow has_sfpt strip_mined
Loop: N3714/N3725 counted [0,int),-4 (4 iters) pre rc multiversion_fast
Loop: N2327/N2326 limit_check sfpts={ 2329 }
Loop: N5064/N2322 limit_check counted [int,int),-32 (1000 iters) main multiversion_fast has_sfpt strip_mined
Loop: N3570/N3581 limit_check counted [int,int),-4 (4 iters) post rc multiversion_fast
Loop: N2433/N2431 limit_check counted [int,int),-1 (4 iters) post rc
blocks
JDK-8365985 C2 SuperWord: TestAliasingFuzzer.java IR rule checks for no multiversioning and check for load/store vectors (Open)
relates to
JDK-8360204 C2 SuperWord: missing RCE with MemorySegment.getAtIndex (Open)
JDK-8324751 C2 SuperWord: Aliasing Analysis runtime check (In Progress)