Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8348753 | 21.0.8-oracle | Johny Jose | P4 | Open | Unresolved | |
JDK-8348754 | 17.0.16-oracle | Johny Jose | P4 | Open | Unresolved | |
JDK-8348755 | 11.0.28-oracle | Johny Jose | P4 | Open | Unresolved | |
JDK-8348756 | 8u461 | Johny Jose | P4 | Open | Unresolved | |
A race condition in the String constructor taking a char[] (and probably other constructors too) allows creating a String with an incorrect coder: a String that contains only latin-1 characters but is still encoded as UTF-16.
This happens because the content of the passed-in array may change between the constructor checking whether it can be encoded as latin-1 and the constructor encoding it as UTF-16.
See https://wouter.coekaerts.be/2023/breaking-string
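
The check-then-encode window described above can be illustrated with a toy model. This is only a sketch and not the JDK's implementation: `CompactModel`, `racy`, `robust`, and the `betweenSteps` hook are hypothetical names, and the hook stands in for a concurrent writer so that the interleaving is deterministic. The `robust` variant shows one possible way to close the window (snapshot first, then check and encode from the snapshot); it is not necessarily the approach taken in the actual fix.

```java
import java.util.Arrays;

/** Toy model of a compact string: a byte[] payload plus a coder flag. Not JDK code. */
final class CompactModel {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    final byte coder;
    final byte[] value;

    CompactModel(byte coder, byte[] value) {
        this.coder = coder;
        this.value = value;
    }

    /** True if every char fits in one byte (0..255). */
    static boolean fitsLatin1(char[] cs) {
        for (char c : cs) {
            if (c > 0xFF) return false;
        }
        return true;
    }

    /** Encodes cs with the given coder: one byte per char, or two bytes (big-endian). */
    static byte[] encode(char[] cs, byte coder) {
        if (coder == LATIN1) {
            byte[] out = new byte[cs.length];
            for (int i = 0; i < cs.length; i++) out[i] = (byte) cs[i];
            return out;
        }
        byte[] out = new byte[cs.length * 2];
        for (int i = 0; i < cs.length; i++) {
            out[2 * i] = (byte) (cs[i] >> 8);
            out[2 * i + 1] = (byte) cs[i];
        }
        return out;
    }

    /** Racy: picks the coder from the shared array, then reads the shared array again. */
    static CompactModel racy(char[] shared, Runnable betweenSteps) {
        byte coder = fitsLatin1(shared) ? LATIN1 : UTF16;
        betweenSteps.run(); // simulated write from another thread
        return new CompactModel(coder, encode(shared, coder));
    }

    /** Sketch of a robust variant: check and encode from one private snapshot. */
    static CompactModel robust(char[] shared, Runnable betweenSteps) {
        char[] snapshot = shared.clone();
        byte coder = fitsLatin1(snapshot) ? LATIN1 : UTF16;
        betweenSteps.run(); // the write no longer affects the result
        return new CompactModel(coder, encode(snapshot, coder));
    }

    /** Equality that compares the coder before the bytes. */
    boolean sameAs(CompactModel other) {
        return coder == other.coder && Arrays.equals(value, other.value);
    }

    char charAt(int i) {
        if (coder == LATIN1) return (char) (value[i] & 0xFF);
        return (char) (((value[2 * i] & 0xFF) << 8) | (value[2 * i + 1] & 0xFF));
    }

    public static void main(String[] args) {
        CompactModel good = racy("foo".toCharArray(), () -> {});
        char[] shared = "foo".toCharArray();
        shared[0] ^= 256; // start with a char that is not latin-1
        // The "other thread" flips the char back after the coder was chosen.
        CompactModel broken = racy(shared, () -> shared[0] ^= 256);
        System.out.println(broken.charAt(0) == 'f'); // true
        System.out.println(good.sameAs(broken));     // false
    }
}
```

In this model the broken value holds the characters of "foo" but carries the UTF-16 coder, so a comparison that checks the coder first reports a mismatch even though the characters agree.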
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Concurrently modify the char[] passed into the String constructor. See example code.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
A String where .equals and other methods behave correctly
ACTUAL -
A String where .equals and other methods are inconsistent with its contents
---------- BEGIN SOURCE ----------
import java.util.Arrays;

// The wrapper class name is arbitrary; it only makes the example compilable.
public class BreakString {
    /**
     * Given a latin-1 String, creates a copy that is
     * incorrectly encoded as UTF-16.
     */
    static String breakIt(String original) {
        // latin-1 covers code units 0..255
        if (original.chars().max().orElseThrow() > 255) {
            throw new IllegalArgumentException(
                    "Can only break latin-1 Strings");
        }
        char[] chars = original.toCharArray();
        // in another thread, flip the first character back
        // and forth between being encodable as latin-1 or not
        Thread thread = new Thread(() -> {
            while (!Thread.interrupted()) {
                chars[0] ^= 256;
            }
        });
        thread.start();
        // at the same time call the String constructor,
        // until we hit the race condition
        while (true) {
            String s = new String(chars);
            if (s.charAt(0) < 256 && !original.equals(s)) {
                thread.interrupt();
                return s;
            }
        }
    }

    public static void main(String[] args) {
        String a = "foo";
        String b = breakIt(a);
        // they are not equal to each other
        System.out.println(a.equals(b));
        // => false
        // they do contain the same series of characters
        System.out.println(Arrays.equals(a.toCharArray(),
                b.toCharArray()));
        // => true
    }
}
---------- END SOURCE ----------
FREQUENCY : always
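
For code that may already hold such a String, a small check can help during diagnosis. The following is a minimal sketch, assuming the inconsistency is limited to the coder as described in this report; `StringRepair`, `canonicalize`, and `isConsistent` are hypothetical helper names, not JDK APIs. It relies on `toCharArray()` returning a fresh array that no other thread can modify, so rebuilding from that array is not exposed to the race.

```java
/** Hypothetical diagnostic helper; not part of the JDK. */
final class StringRepair {
    /** Rebuilds s from a private char[] copy that no other thread can see. */
    static String canonicalize(String s) {
        return new String(s.toCharArray());
    }

    /** True if s agrees with a freshly rebuilt copy of its own characters. */
    static boolean isConsistent(String s) {
        return s.equals(canonicalize(s));
    }

    public static void main(String[] args) {
        System.out.println(isConsistent("foo"));                // true
        System.out.println(isConsistent("\u65e5\u672c\u8a9e")); // true
    }
}
```

For a String like `b` from the reproducer, `isConsistent` should report `false`, since the rebuilt copy is encoded as latin-1 while `b` is not. Callers that construct Strings from arrays shared between threads can also pass a private copy (for example `chars.clone()`) to the constructor, which keeps the array stable for the duration of the call.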
- backported by
  - JDK-8348753 Improve robustness of String constructors with mutable array inputs (Open)
  - JDK-8348754 Improve robustness of String constructors with mutable array inputs (Open)
  - JDK-8348755 Improve robustness of String constructors with mutable array inputs (Open)
  - JDK-8348756 Improve robustness of String constructors with mutable array inputs (Open)
- csr for
  - JDK-8319228 Improve robustness of String constructors with mutable array inputs (Closed)
- relates to
  - JDK-8325737 Release Note: `Files.readString` May Return Incorrect String When Using UTF-16 or Other Charsets (Resolved)
  - JDK-8325590 Regression in round-tripping UTF-16 strings after JDK-8311906 (Closed)
  - JDK-8321514 UTF16 string gets constructed incorrectly from codepoints if CompactStrings is not enabled (Closed)
  - JDK-8321180 Condition for non-latin1 string size too large exception is off by one (Resolved)