Type: CSR
Resolution: Withdrawn
Priority: P3
Fix Version/s: tbd
Component/s: core-libs
Labels: None
Subcomponent: java.lang
Compatibility Kind: behavioral
Compatibility Risk: low
Compatibility Risk Description:

Applications that rely on the current behavior could break for supplementary code points that have case mappings. Case-insensitive comparison should have handled these code points when supplementary character support was first introduced in the JDK. So although this is technically an incompatibility, very few applications are expected to be affected.

Interface Kind: Java API
Scope: SE

Summary

Date/Time names with supplementary characters cannot be parsed in a case-insensitive manner.

Problem

JDK 15 added a new locale, "ff-Adlm-LR", whose locale data (such as month and day names) is written in Adlam script, which is encoded in a supplementary plane. java.text.DateFormat parses those names case-insensitively, but parsing throws an exception because the underlying String.regionMatches(ignoreCase == true) fails for supplementary characters. For example:

"\ud83a\udd2e".regionMatches(true, 0, "\ud83a\udd0c", 0, 2)

returns false, where:

"\ud83a\udd2e" == 'ADLAM SMALL LETTER O' (U+1E92E)
"\ud83a\udd0c" == 'ADLAM CAPITAL LETTER O' (U+1E90C)

even though each of the following expressions evaluates to true:

"\ud83a\udd2e".toUpperCase(Locale.ROOT).equals("\ud83a\udd0c")
Character.toUpperCase(0x1e92e) == 0x1e90c
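
The mismatch described above can be reproduced with a small standalone program. This is only a sketch: the class and method names are made up for illustration, and it assumes a JDK whose Unicode data includes the Adlam case mappings. The value printed for regionMatches depends on the JDK release; this issue reports false on JDK 15.

```java
import java.util.Locale;

// Hypothetical demo class; the names are illustrative and not part of the JDK.
final class AdlamCaseDemo {
    // ADLAM SMALL LETTER O (U+1E92E) and ADLAM CAPITAL LETTER O (U+1E90C),
    // each written as a surrogate pair.
    static final String SMALL_O = "\ud83a\udd2e";
    static final String CAPITAL_O = "\ud83a\udd0c";

    // True when both the String-level and the code-point-level upper-case
    // mappings turn the small letter into the capital letter.
    static boolean upperCaseMappingAgrees() {
        return SMALL_O.toUpperCase(Locale.ROOT).equals(CAPITAL_O)
                && Character.toUpperCase(0x1e92e) == 0x1e90c;
    }

    // The call from the problem statement; the result varies by JDK release.
    static boolean regionMatchesIgnoringCase() {
        return SMALL_O.regionMatches(true, 0, CAPITAL_O, 0, 2);
    }

    public static void main(String[] args) {
        System.out.println("case mapping agrees: " + upperCaseMappingAgrees());
        System.out.println("regionMatches(ignoreCase): " + regionMatchesIgnoringCase());
    }
}
```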

Solution

Change the specifications of String.regionMatches(boolean, ...), String.equalsIgnoreCase(), and String.compareToIgnoreCase() to compare by code point when supplementary characters are involved. Characters in the Basic Multilingual Plane (<= \uFFFF) continue to be compared as code units obtained from charAt().

Although this change alters how the strings are traversed during comparison, the rationale is that these String methods should behave consistently for all characters (code points), whether or not they are in the Basic Multilingual Plane. There is no reason to exclude supplementary characters from case-insensitive string comparison.
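
One way to picture the proposed traversal is the sketch below. It is not the JDK implementation: the class CodePointRegionMatch and its methods are hypothetical helpers. It assumes that a whole code point is read only when both strings have a valid surrogate pair at the current position, and that every other position, including unpaired surrogates, is compared through charAt().

```java
// Hypothetical helper class; an illustration of the idea, not JDK source.
final class CodePointRegionMatch {

    // Case-insensitive equality of two code points (or code units), using the
    // toLowerCase(toUpperCase(x)) folding named in the String specification.
    static boolean sameIgnoringCase(int x, int y) {
        return x == y
                || Character.toLowerCase(Character.toUpperCase(x))
                   == Character.toLowerCase(Character.toUpperCase(y));
    }

    // Compares len code units of a (from aOff) with b (from bOff), ignoring case.
    static boolean regionMatchesIgnoreCase(String a, int aOff, String b, int bOff, int len) {
        // Same range checks as String.regionMatches: out-of-range regions never match.
        if (aOff < 0 || bOff < 0
                || aOff > (long) a.length() - len
                || bOff > (long) b.length() - len) {
            return false;
        }
        int k = 0;
        while (k < len) {
            int i = aOff + k;
            int j = bOff + k;
            // A pair counts only if both of its code units lie inside the region.
            boolean pairInA = k + 1 < len
                    && Character.isSurrogatePair(a.charAt(i), a.charAt(i + 1));
            boolean pairInB = k + 1 < len
                    && Character.isSurrogatePair(b.charAt(j), b.charAt(j + 1));
            if (pairInA && pairInB) {
                // Supplementary characters on both sides: compare whole code points.
                if (!sameIgnoringCase(a.codePointAt(i), b.codePointAt(j))) {
                    return false;
                }
                k += 2; // the low surrogates at k + 1 are already covered
            } else {
                // BMP characters or unpaired surrogates: compare code units.
                if (!sameIgnoringCase(a.charAt(i), b.charAt(j))) {
                    return false;
                }
                k++;
            }
        }
        return true;
    }
}
```

With this helper, comparing "\ud83a\udd2e" against "\ud83a\udd0c" over two code units matches on a JDK that knows the Adlam case mappings, while plain BMP input such as "abc" and "ABC" follows the existing per-char path.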

Specification

Append the following text immediately after the last list item of conditions in the method description of String.regionMatches(boolean, ...):

* In case that both <i>toffset+k</i> and <i>ooffset+k</i> point to
* supplementary characters, that is <i>k</i> point to high surrogates
* and <i>k+1</i> point to low surrogates, {@code codePointAt()} is
* used to retrieve the code points in place for {@code charAt()} method,
* and <i>k+1</i> is excluded from the above condition. If they point
* to an unpaired high or low surrogates, they are compared using
* {@code charAt()} method.

Change the following list item of conditions in the method description of String.equalsIgnoreCase() method from:

*   <li> Calling {@code Character.toLowerCase(Character.toUpperCase(char))}
*        on each character produces the same result

to:

*   <li> Calling {@code Character.toLowerCase(Character.toUpperCase(int))}
*        on each code point produces the same result

Change the following description in the method description of String.compareToIgnoreCase() method from:

* {@code Character.toLowerCase(Character.toUpperCase(character))} on
* each character.

to:

* {@code Character.toLowerCase(Character.toUpperCase(int))} on
* each code point of the character.
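
The reworded conditions for equalsIgnoreCase() and compareToIgnoreCase() can be illustrated with a short sketch that applies Character.toLowerCase(Character.toUpperCase(int)) to each code point. The class CodePointIgnoreCase and its methods are hypothetical and are not the JDK implementation. In particular, the ordering chosen here (difference of folded code points, then remaining length) is an assumption made only for illustration.

```java
import java.util.Arrays;

// Hypothetical helper class; an illustration of the wording, not JDK source.
final class CodePointIgnoreCase {

    // The folding named in the specification text, applied to a code point.
    static int fold(int codePoint) {
        return Character.toLowerCase(Character.toUpperCase(codePoint));
    }

    // Equal when the lengths match and every folded code point is the same.
    static boolean equalsIgnoreCase(String a, String b) {
        if (a == null || b == null) {
            return a == b;
        }
        if (a.length() != b.length()) {
            return false;
        }
        int[] foldedA = a.codePoints().map(CodePointIgnoreCase::fold).toArray();
        int[] foldedB = b.codePoints().map(CodePointIgnoreCase::fold).toArray();
        return Arrays.equals(foldedA, foldedB);
    }

    // Walks both strings one code point at a time and returns the first
    // non-zero difference of folded code points; ties fall back to the
    // number of code units left over.
    static int compareIgnoreCase(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int x = a.codePointAt(i);
            int y = b.codePointAt(j);
            int diff = fold(x) - fold(y);
            if (diff != 0) {
                return diff;
            }
            i += Character.charCount(x);
            j += Character.charCount(y);
        }
        return (a.length() - i) - (b.length() - j);
    }
}
```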

CSR of:

JDK-8248434 some newly added locale cannot parse uppercased date string (Closed)

Assignee: Naoto Sato
Reporter: Webbug Group
Votes: 0
Watchers: 1
Created: 2020-06-29 16:35
Updated: 2020-07-01 11:23
Resolved: 2020-07-01 11:23
