- Bug
- Resolution: Fixed
- P3
- 8
- b14
- generic
- generic
- Verified

Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8045661 | 8u25 | Naoto Sato | P3 | Resolved | Fixed | b01 |
JDK-8042814 | 8u20 | Naoto Sato | P3 | Resolved | Fixed | b17 |
JDK-8052712 | emb-8u26 | Naoto Sato | P3 | Resolved | Fixed | b17 |
JDK-8072250 | 7u85 | Naoto Sato | P3 | Resolved | Fixed | b01 |
JDK-8043910 | 7u80 | Naoto Sato | P3 | Resolved | Fixed | b03 |
JDK-8065284 | 7u79 | Naoto Sato | P3 | Resolved | Fixed | b01 |
JDK-8065141 | 7u76 | Naoto Sato | P3 | Closed | Fixed | b09 |
The text "String.toLowerCase incorrectly increases length" assumes that the length increase is a problem, but it isn't: the documentation specifically says "Since case mappings are not always 1:1 char mappings, the resulting String may be a different length than the original String."
Looking at http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt, I see:
# Preserve canonical equivalence for I with dot. Turkic is handled below.
0130; 0069 0307; 0130; 0130; # LATIN CAPITAL LETTER I WITH DOT ABOVE
My understanding of this is that in all locales *except* the ones handled specially ('az', 'lt', and 'tr'), we should convert bidirectionally between "\u0130" and "\u0069\u0307".
That is, lowercasing "\u0130" should result in "\u0069\u0307", and converting "\u0069\u0307" to uppercase or titlecase should yield "\u0130".
Note that this allows round-trip conversions, which is why it is specified this way.
Java 7 correctly does the former conversion, but not the latter.
Java 8 does neither.
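
The behavior described above can be checked with a small program. This is a minimal sketch, not part of the original report: the class name `DottedCapitalIDemo` and the helpers `describe` and `lower` are hypothetical names chosen for illustration. It prints the UTF-16 code units produced by lowercasing "\u0130" in the root and Turkish locales. It also prints the result of uppercasing "\u0069\u0307" without assuming an outcome, since that direction is the reporter's reading of SpecialCasing.txt.

```java
import java.util.Locale;

public class DottedCapitalIDemo {
    // Hypothetical helper: renders each UTF-16 code unit as U+XXXX so that
    // combining marks are visible in the output.
    public static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Hypothetical helper: lowercases using the locale for a language tag.
    public static String lower(String s, String languageTag) {
        return s.toLowerCase(Locale.forLanguageTag(languageTag));
    }

    public static void main(String[] args) {
        String dottedCapitalI = "\u0130"; // LATIN CAPITAL LETTER I WITH DOT ABOVE
        String iPlusDot = "\u0069\u0307"; // 'i' followed by COMBINING DOT ABOVE

        // Locale-independent lowercasing: SpecialCasing.txt maps 0130 to 0069 0307.
        System.out.println("root lower: " + describe(dottedCapitalI.toLowerCase(Locale.ROOT)));

        // Turkish is one of the specially handled locales.
        System.out.println("tr lower:   " + describe(lower(dottedCapitalI, "tr")));

        // Reverse direction: printed for inspection only, no result is assumed here.
        System.out.println("root upper: " + describe(iPlusDot.toUpperCase(Locale.ROOT)));
    }
}
```

On a JDK build that contains this fix, the first line should show `U+0069 U+0307` and the second `U+0069`. A build affected by the regression would show a single code unit for the root locale as well.
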
- backported by
  - JDK-8042814 String.toLowerCase regression - violates Unicode standard (Resolved)
  - JDK-8043910 String.toLowerCase regression - violates Unicode standard (Resolved)
  - JDK-8045661 String.toLowerCase regression - violates Unicode standard (Resolved)
  - JDK-8052712 String.toLowerCase regression - violates Unicode standard (Resolved)
  - JDK-8065284 String.toLowerCase regression - violates Unicode standard (Resolved)
  - JDK-8072250 String.toLowerCase regression - violates Unicode standard (Resolved)
  - JDK-8065141 String.toLowerCase regression - violates Unicode standard (Closed)
- blocks
  - JDK-8030201 Nashorn: String.prototype.toLowerCase() requires SpecialCasing support (Closed)
- duplicates
  - JDK-8041387 Applets not working when the preffered language is Turkish (Closed)
- relates to
  - JDK-8043186 javac test langtools/tools/javac/util/StringUtilsTest.java fails (Closed)
  - JDK-8049038 In turkish locale, String.equalsIgnoreCase() returns "true" for character \u0130 and \u0131. (Closed)
  - JDK-6404304 RFE: Unicode 5.1 support (Closed)
  - JDK-8020037 String.toLowerCase incorrectly increases length, if string contains \u0130 char (Closed)