Type: Bug
Resolution: Fixed
Priority: P2
Fix Version/s: 19
Affects Version/s: 9, 11, 17, 18, 19
Component/s: core-libs
Labels:

Subcomponent: java.lang
Resolved In Build: b05

Issue	Fix Version	Assignee	Priority	Status	Resolution	Resolved In Build
JDK-8280636	18.0.1	Claes Redestad	P2	Resolved	Fixed	b04
JDK-8279962	18	Claes Redestad	P2	Resolved	Fixed	b32
JDK-8279985	17.0.3-oracle	Dukebot	P2	Closed	Fixed	b03
JDK-8280095	17.0.3	Goetz Lindenmaier	P2	Resolved	Fixed	b01
JDK-8280039	11.0.15-oracle	Vladimir Kozlov	P2	Closed	Fixed	b03
JDK-8280701	11.0.15	Goetz Lindenmaier	P2	Resolved	Fixed	b01

While attempting to replace the ASCII fast loop in `String.encodeUTF8_UTF16`, I noticed that restructuring the code so that `char c` is scoped locally to each loop improves the method's performance by helping C2 optimize each loop better. I narrowed it down to a change as simple as this:

```
diff --git a/src/java.base/share/classes/java/lang/String.java b/src/java.base/share/classes/java/lang/String.java
index abb35ebaeb1..f84d60f92cc 100644
--- a/src/java.base/share/classes/java/lang/String.java
+++ b/src/java.base/share/classes/java/lang/String.java
@@ -1284,14 +1284,17 @@ public final class String
         int sp = 0;
         int sl = val.length >> 1;
         byte[] dst = new byte[sl * 3];
-        char c;
-        while (sp < sl && (c = StringUTF16.getChar(val, sp)) < '\u0080') {
+        while (sp < sl) {
+            char c = StringUTF16.getChar(val, sp);
+            if (c >= '\u0080') {
+                break;
+            }
             // ascii fast loop;
             dst[dp++] = (byte)c;
             sp++;
         }
         while (sp < sl) {
-            c = StringUTF16.getChar(val, sp++);
+            char c = StringUTF16.getChar(val, sp++);
             if (c < 0x80) {
                 dst[dp++] = (byte)c;
             } else if (c < 0x800) {
```
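To make the two loop shapes easier to compare outside the JDK sources, here is a minimal standalone sketch. It is an illustration under stated assumptions, not the JDK implementation: a plain `char[]` stands in for the `byte[]` accessed through `StringUTF16.getChar`, the class name `LoopShapeSketch` and the helper `putChar` are hypothetical, and surrogate pairs are not handled. It only shows that both shapes produce the same bytes; it says nothing about what C2 does with either one.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LoopShapeSketch {

    // Shape before the patch: one `char c` declared up front and shared by both loops.
    static byte[] encodeSharedLocal(char[] val) {
        int dp = 0;
        int sp = 0;
        int sl = val.length;
        byte[] dst = new byte[sl * 3];
        char c;
        while (sp < sl && (c = val[sp]) < '\u0080') {
            // ascii fast loop
            dst[dp++] = (byte) c;
            sp++;
        }
        while (sp < sl) {
            c = val[sp++];
            dp = putChar(dst, dp, c);
        }
        return Arrays.copyOf(dst, dp);
    }

    // Shape after the patch: each loop declares its own `char c`.
    static byte[] encodeLoopLocal(char[] val) {
        int dp = 0;
        int sp = 0;
        int sl = val.length;
        byte[] dst = new byte[sl * 3];
        while (sp < sl) {
            char c = val[sp];
            if (c >= '\u0080') {
                break;
            }
            // ascii fast loop
            dst[dp++] = (byte) c;
            sp++;
        }
        while (sp < sl) {
            char c = val[sp++];
            dp = putChar(dst, dp, c);
        }
        return Arrays.copyOf(dst, dp);
    }

    // Hypothetical helper: writes one BMP char as 1 to 3 UTF-8 bytes and returns
    // the new write position. Assumption: no surrogates in the input.
    static int putChar(byte[] dst, int dp, char c) {
        if (c < 0x80) {
            dst[dp++] = (byte) c;
        } else if (c < 0x800) {
            dst[dp++] = (byte) (0xc0 | (c >> 6));
            dst[dp++] = (byte) (0x80 | (c & 0x3f));
        } else {
            dst[dp++] = (byte) (0xe0 | (c >> 12));
            dst[dp++] = (byte) (0x80 | ((c >> 6) & 0x3f));
            dst[dp++] = (byte) (0x80 | (c & 0x3f));
        }
        return dp;
    }

    public static void main(String[] args) {
        String s = "hello \u00e9\u20ac world";
        byte[] expected = s.getBytes(StandardCharsets.UTF_8);
        // Both shapes should agree with the platform encoder for this input.
        System.out.println(Arrays.equals(expected, encodeSharedLocal(s.toCharArray())));
        System.out.println(Arrays.equals(expected, encodeLoopLocal(s.toCharArray())));
    }
}
```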

Results on a few microbenchmarks I'm updating to better stress this code:
Baseline:
```
Benchmark                                      (charsetName)  Mode  Cnt     Score     Error  Units
StringEncode.WithCharset.encodeUTF16                   UTF-8  avgt   15   171.853 ±  10.275  ns/op
StringEncode.WithCharset.encodeUTF16LongEnd            UTF-8  avgt   15  1991.586 ±  82.234  ns/op
StringEncode.WithCharset.encodeUTF16LongStart          UTF-8  avgt   15  8422.458 ± 473.161  ns/op
```
Patch:
```
Benchmark                                      (charsetName)  Mode  Cnt     Score     Error  Units
StringEncode.WithCharset.encodeUTF16                   UTF-8  avgt   15   128.525 ±   6.573  ns/op
StringEncode.WithCharset.encodeUTF16LongEnd            UTF-8  avgt   15  1843.455 ±  72.984  ns/op
StringEncode.WithCharset.encodeUTF16LongStart          UTF-8  avgt   15  4124.791 ± 308.683  ns/op
```

Going back, this seems to have been an issue with this code since its inception with JEP 254 in JDK 9.

The micro encodeUTF16LongEnd encodes a longer string that is mostly ASCII but has a non-ASCII code point at the end. This exaggerates the usefulness of the ASCII loop. encodeUTF16LongStart tests the same string but with the non-ASCII code point moved to the front. This stresses the non-ASCII loop. We see that the patch above helps in general, but mainly improves the microbenchmark that spends its time in the second loop.
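As a rough illustration of how the two inputs differ, the sketch below builds strings of the two shapes and counts how many leading chars the ASCII loop would consume. Everything here is an assumption made for illustration: the class and method names are hypothetical, the lengths are arbitrary, and the strings are not the ones used by the actual micros. The non-ASCII char is chosen above U+00FF so that, with compact strings, the whole string is stored as UTF-16 and takes the UTF-16 encoding path.

```java
import java.nio.charset.StandardCharsets;

public class EncodeInputsSketch {

    // Assumption: any char above U+00FF would do; the euro sign is an arbitrary pick.
    static final char NON_ASCII = '\u20ac';

    // Mostly-ASCII string with the non-ASCII char at the end.
    static String longEnd(int asciiLen) {
        return "a".repeat(asciiLen) + NON_ASCII;
    }

    // Same content with the non-ASCII char moved to the front.
    static String longStart(int asciiLen) {
        return NON_ASCII + "a".repeat(asciiLen);
    }

    // Number of leading chars below 0x80, i.e. the iterations an ASCII fast loop gets.
    static int asciiPrefixLength(String s) {
        int i = 0;
        while (i < s.length() && s.charAt(i) < '\u0080') {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        String end = longEnd(1000);
        String start = longStart(1000);
        System.out.println("LongEnd ASCII prefix:   " + asciiPrefixLength(end));
        System.out.println("LongStart ASCII prefix: " + asciiPrefixLength(start));
        // Both inputs encode to the same number of bytes; only the loop split differs.
        System.out.println(end.getBytes(StandardCharsets.UTF_8).length
                == start.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

With the LongEnd shape the ASCII loop handles every char but the last; with the LongStart shape it exits on the first char and the second loop does all the work. `String.repeat` requires JDK 11 or later.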

There's likely a compiler bug hiding in plain sight here where the potentially uninitialized local `char c` messes up the loop optimization of the second loop. I think the above patch is reasonable to put back into the JDK while we investigate if/how C2 can better handle this pattern.

backported by

JDK-8279962 Loop optimization issue in String.encodeUTF8_UTF16 (Resolved)

JDK-8280095 Loop optimization issue in String.encodeUTF8_UTF16 (Resolved)

JDK-8280636 Loop optimization issue in String.encodeUTF8_UTF16 (Resolved)

JDK-8280701 Loop optimization issue in String.encodeUTF8_UTF16 (Resolved)

JDK-8279985 Loop optimization issue in String.encodeUTF8_UTF16 (Closed)

JDK-8280039 Loop optimization issue in String.encodeUTF8_UTF16 (Closed)

relates to

JDK-8054307 JEP 254: Compact Strings (Closed)

JDK-8279888 Local variable independently used by multiple loops can interfere with loop optimizations (Resolved)

links to

Commit openjdk/jdk11u-dev/84ed9671

Commit openjdk/jdk17u-dev/69d296d4

Commit openjdk/jdk18/ff856593

Commit openjdk/jdk/c3d0a940

Review openjdk/jdk11u-dev/791

Review openjdk/jdk17u-dev/97

Review openjdk/jdk18/99

Review openjdk/jdk/7026


Assignee: Claes Redestad
Reporter: Claes Redestad
Votes: 0
Watchers: 6

Created: 2022-01-11 01:26
Updated: 2022-04-11 16:06
Resolved: 2022-01-11 06:49
