- Enhancement
- Resolution: Fixed
- P4
- None
- b08
- generic
- generic
java.lang.String.regionMatches(ignoreCase == true, ...)
java.lang.String.equalsIgnoreCase()
java.lang.String.compareToIgnoreCase()
These methods are supposed to match or compare strings in a case-insensitive manner. However, their specs and implementations are char-based, so they cannot handle supplementary characters correctly. For example,
"\ud83a\udd2e".regionMatches(true, 0, "\ud83a\udd0c", 0, 2)
returns false (conforming to the existing spec), although "\ud83a\udd2e" is the 'ADLAM SMALL LETTER O' character, which has the code point U+1E92E, and "\ud83a\udd0c" is the 'ADLAM CAPITAL LETTER O' character, which has the code point U+1E90C. It should therefore return true if it is to be true to the meaning of "ignore case." This behavior contradicts the fact that:
"\ud83a\udd2e".toUpperCase(Locale.ROOT).equals("\ud83a\udd0c")
Character.toUpperCase(0x1e92e) == 0x1e90c
each evaluate to true.
Both the spec and its implementation need to be modified.
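One way to picture a code point-based comparison is the sketch below. It is only an illustration under stated assumptions, not the change that was made in the JDK: the class `CodePointCaseInsensitive` and its methods are hypothetical names, and the folding rule (compare the upper-case mappings, then the lower-case mappings of those) is an assumption modelled on the existing char-based description. Only `String.codePointAt`, `Character.charCount`, `Character.toUpperCase(int)` and `Character.toLowerCase(int)` are real JDK calls.

```java
final class CodePointCaseInsensitive {

    // Hypothetical helper (not the JDK implementation): walks both strings
    // one code point at a time, so a surrogate pair is compared as a single
    // character instead of as two independent char values.
    static boolean equalsIgnoreCase(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int cpA = a.codePointAt(i);
            int cpB = b.codePointAt(j);
            if (!sameIgnoringCase(cpA, cpB)) {
                return false;
            }
            // Advance by 1 char for BMP code points, 2 for supplementary ones.
            i += Character.charCount(cpA);
            j += Character.charCount(cpB);
        }
        // Equal only if both strings were consumed completely.
        return i == a.length() && j == b.length();
    }

    // Assumed folding rule: equal as-is, equal after upper-casing,
    // or equal after lower-casing the upper-cased values.
    static boolean sameIgnoringCase(int cpA, int cpB) {
        if (cpA == cpB) {
            return true;
        }
        int upperA = Character.toUpperCase(cpA);
        int upperB = Character.toUpperCase(cpB);
        if (upperA == upperB) {
            return true;
        }
        return Character.toLowerCase(upperA) == Character.toLowerCase(upperB);
    }

    public static void main(String[] args) {
        String small = "\ud83a\udd2e";   // ADLAM SMALL LETTER O, U+1E92E
        String capital = "\ud83a\udd0c"; // ADLAM CAPITAL LETTER O, U+1E90C
        System.out.println(equalsIgnoreCase(small, capital));
        System.out.println(equalsIgnoreCase("abc", "ABD"));
    }
}
```

Because the helper relies on `Character.toUpperCase(int)`, it treats the Adlam pair above as equal on any runtime where `Character.toUpperCase(0x1e92e) == 0x1e90c` holds, independent of how `String.regionMatches` itself behaves on that runtime.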
- csr for
- JDK-8248664 Support supplementary characters in String case insensitive operations
- Closed
- relates to
- JDK-8249718 Refine discussion of code point handling in java.lang.String
- Open
- JDK-8264544 Case-insensitive comparison issue with supplementary characters.
- Resolved
- JDK-8248434 some newly added locale cannot parse uppercased date string.
- Closed
- JDK-8253058 Case insensitive regexes for supplementary characters
- Closed
- JDK-8253059 Case insensitive collators for supplementary characters
- Closed
- JDK-8264545 Refactor case-insensitive comparison for supplementary characters
- Open