- Type: Bug
- Resolution: Fixed
- Priority: P3
- Affects Version/s: 9, 11, 17, 21, 22
- Resolved In Build: b27
- CPU: generic
- OS: generic
- Verification: Verified
| Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build | 
|---|---|---|---|---|---|---|
| JDK-8324700 | 21.0.4-oracle | Weibing Xiao | P3 | Resolved | Fixed | b01 | 
| JDK-8327855 | 21.0.4 | Sonia Zaldana Calles | P3 | Resolved | Fixed | b01 | 
| JDK-8324699 | 17.0.12-oracle | Weibing Xiao | P3 | Resolved | Fixed | b01 | 
| JDK-8329977 | 17.0.12 | Amos SHI | P3 | Resolved | Fixed | b01 | 
| JDK-8324698 | 11.0.25-oracle | Weibing Xiao | P3 | Resolved | Fixed | b01 | 
| JDK-8334451 | 11.0.25 | Martin Doerr | P3 | Resolved | Fixed | b01 | 
Generic. Reproduced on both Linux and macOS (M1).
A DESCRIPTION OF THE PROBLEM :
A regression exists in Java 9+ when creating a String instance from UTF-8 bytes. It is a side effect of string compaction (https://openjdk.org/jeps/254), which changed the decoding logic. Specifically, when constructing a string from bytes:
```
String str = new String(largeBytes, StandardCharsets.UTF_8);
```
If the size of largeBytes is greater than 2^30 bytes (>1 GB) but smaller than INT_MAX (2 GB), construction fails on Java 9+ (including 11, 17, and 21, though the stack traces differ slightly, see below), regardless of JVM heap size. On Java 8, it succeeds when the JVM heap size is sufficient.
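The negative size in the stack traces in this report, -1894967266, is consistent with a byte count being doubled in 32-bit `int` arithmetic, as sizing a buffer with two bytes per char would require. The sketch below only illustrates that arithmetic. The class and method names are hypothetical, the input length of 1,200,000,015 is inferred from the exception message rather than stated in the report, and the exact expression used inside the JDK is not shown here.

```java
public class OverflowDemo {
    // Hypothetical helper: doubles a byte count using 32-bit int arithmetic.
    // For any length >= 2^30 the result wraps around to a negative value.
    static int doubledAsInt(int length) {
        return length << 1;
    }

    public static void main(String[] args) {
        // Assumed input length, inferred from the exception message.
        int inferredLength = 1_200_000_015;

        System.out.println(doubledAsInt(inferredLength)); // -1894967266
        System.out.println(doubledAsInt(1 << 30));         // -2147483648
        System.out.println(2L * inferredLength);           // 2400000030 (no overflow in long)
    }
}
```

Doing the same multiplication in `long` shows the size that was actually needed, which exceeds the maximum Java array length.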
REGRESSION : Last worked in version 8u391
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
// largeBytes is a byte array of about 1.2 GB that contains encoded non-ASCII characters
String str = new String(largeBytes, StandardCharsets.UTF_8);
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The string is constructed successfully when run with a sufficient JVM heap size (java -Xms5G -Xmx8G).
ACTUAL -
Java8:
```
$ java -Xms5G -Xmx8G org/example/Main
(succeeded)
```
Java11 (regardless of heap size):
```
$ java org/example/Main
Exception in thread "main" java.lang.NegativeArraySizeException: -1894967266
at java.base/java.lang.StringCoding.decodeUTF8_0(StringCoding.java:777)
at java.base/java.lang.StringCoding.decodeUTF8(StringCoding.java:734)
at java.base/java.lang.StringCoding.decode(StringCoding.java:257)
at java.base/java.lang.String.<init>(String.java:507)
at java.base/java.lang.String.<init>(String.java:561)
at org.example.Main.main(Main.java:28)
```
Java17 (regardless of heap size):
```
$ java org/example/Main
Exception in thread "main" java.lang.NegativeArraySizeException: -1894967266
at java.base/java.lang.String.<init>(String.java:568)
at java.base/java.lang.String.<init>(String.java:1387)
at org.example.Main.main(Main.java:28)
```
Java21 (regardless of heap size):
```
$ java org/example/Main
Exception in thread "main" java.lang.NegativeArraySizeException: -1894967266
at java.base/java.lang.String.<init>(String.java:577)
at java.base/java.lang.String.<init>(String.java:1425)
at org.example.Main.main(Main.java:28)
```
Java8, default heap size:
```
$ java org/example/Main
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.lang.StringCoding.decode(StringCoding.java:215)
at java.lang.String.<init>(String.java:463)
at java.lang.String.<init>(String.java:515)
at org.example.Main.main(Main.java:28)
```
---------- BEGIN SOURCE ----------
https://gist.github.com/Abacn/e8fda767f53e723db6d71f21f4db2187
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
This will probably work: split the byte array, construct two strings, and concatenate them.
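A minimal sketch of that workaround follows, assuming the input is valid UTF-8. It is untested against the multi-gigabyte arrays from the report, and the class and method names are hypothetical. The split point must not land inside a multi-byte sequence, so the helper steps back from the midpoint until it reaches a byte that is not a UTF-8 continuation byte (bit pattern `10xxxxxx`).

```java
import java.nio.charset.StandardCharsets;

public class SplitDecode {
    // Hypothetical helper: returns an index near the middle of the array
    // that does not fall inside a multi-byte UTF-8 sequence.
    static int splitPoint(byte[] utf8) {
        int mid = utf8.length / 2;
        // Continuation bytes look like 10xxxxxx; step back to a lead byte.
        while (mid > 0 && (utf8[mid] & 0xC0) == 0x80) {
            mid--;
        }
        return mid;
    }

    // Decodes the two halves separately, then concatenates the results.
    static String decodeInTwoHalves(byte[] utf8) {
        int mid = splitPoint(utf8);
        String head = new String(utf8, 0, mid, StandardCharsets.UTF_8);
        String tail = new String(utf8, mid, utf8.length - mid, StandardCharsets.UTF_8);
        return head + tail;
    }

    public static void main(String[] args) {
        // Small sample input for illustration only.
        byte[] sample = "h\u00e9llo w\u00f6rld \u20ac".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeInTwoHalves(sample));
    }
}
```

Each half stays well under 2^30 bytes for inputs like the ~1.2 GB array in this report. The concatenation may still fail if the decoded text is too long to fit in a single String.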
FREQUENCY : always
- backported by
  - JDK-8324698 NegativeArraySizeException decoding >1G UTF8 bytes with non-ascii characters - Resolved
  - JDK-8324699 NegativeArraySizeException decoding >1G UTF8 bytes with non-ascii characters - Resolved
  - JDK-8324700 NegativeArraySizeException decoding >1G UTF8 bytes with non-ascii characters - Resolved
  - JDK-8327855 NegativeArraySizeException decoding >1G UTF8 bytes with non-ascii characters - Resolved
  - JDK-8329977 NegativeArraySizeException decoding >1G UTF8 bytes with non-ascii characters - Resolved
  - JDK-8334451 NegativeArraySizeException decoding >1G UTF8 bytes with non-ascii characters - Resolved
- links to
  - Commit: openjdk/jdk11u-dev/ef08f4eb
  - Commit: openjdk/jdk17u-dev/fc01ffe9
  - Commit: openjdk/jdk21u-dev/779204c9
  - Commit: openjdk/jdk/82796bde
  - Review: openjdk/jdk11u-dev/2783
  - Review: openjdk/jdk17u-dev/2279
  - Review: openjdk/jdk21u-dev/332
  - Review: openjdk/jdk21u-dev/344
  - Review: openjdk/jdk/16974