Type: Enhancement
Resolution: Fixed
Priority: P4
Affects Version: 17, 18, 19
Resolved In Build: b05
CPU: aarch64
OS: generic

Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8282793 | 17.0.4-oracle | Tobias Hartmann | P4 | Resolved | Fixed | b01 |
JDK-8282654 | 17.0.4 | Dmitry Chuyko | P4 | Resolved | Fixed | b01 |
After JDK-8268231, generate_compare_long_string_same_encoding contains the following code shape:
    if (SoftwarePrefetchHintDistance >= 0) {
      __ bind(LARGE_LOOP_PREFETCH);
        __ prfm(Address(str1, SoftwarePrefetchHintDistance));
        __ prfm(Address(str2, SoftwarePrefetchHintDistance));
        __ align(OptoLoopAlignment);
        for (int i = 0; i < 4; i++) {
          __ ldp(tmp1, tmp1h, Address(str1, i * 16));
          __ ldp(tmp2, tmp2h, Address(str2, i * 16));
          __ cmp(tmp1, tmp2);
          __ ccmp(tmp1h, tmp2h, 0, Assembler::EQ);
          __ br(Assembler::NE, DIFF);
        }
        __ sub(cnt2, cnt2, isLL ? 64 : 32);
        __ add(str1, str1, 64);
        __ add(str2, str2, 64);
        __ subs(rscratch2, cnt2, largeLoopExitCondition);
        __ br(Assembler::GE, LARGE_LOOP_PREFETCH);
        __ cbz(cnt2, LENGTH_DIFF); // no more chars left?
    }
I believe the intent is to make sure that the LARGE_LOOP_PREFETCH jump target is aligned. In other words, the code should probably look like this:
    __ align(OptoLoopAlignment);
    __ bind(LARGE_LOOP_PREFETCH);
      __ prfm(Address(str1, SoftwarePrefetchHintDistance));
      __ prfm(Address(str2, SoftwarePrefetchHintDistance));
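This is the form that every other use of align(OptoLoopAlignment) takes. In the current shape the alignment lands after the two prfm instructions, so it aligns the first ldp rather than the branch target, and any padding it emits sits inside the loop body. That probably carries a minor performance penalty.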
Tentatively assigning to Wang Huang, who did the original JDK-8268231.
- backported by
  - JDK-8282654 AArch64: generate_compare_long_string_same_encoding and LARGE_LOOP_PREFETCH alignment (Resolved)
  - JDK-8282793 AArch64: generate_compare_long_string_same_encoding and LARGE_LOOP_PREFETCH alignment (Resolved)
- relates to
  - JDK-8268231 Aarch64: Use Ldp in intrinsics for String.compareTo (Resolved)
- links to
  - Commit openjdk/jdk17u-dev/a51a5f03
  - Commit openjdk/jdk/126328cb
  - Review openjdk/jdk17u-dev/191
  - Review openjdk/jdk/7007