JDK-8248516

Some newly added locales cannot parse uppercased date strings.


    • CSR
    • Resolution: Withdrawn
    • P3
    • tbd
    • core-libs
    • None
    • behavioral
    • low
    • Applications that expect the current behavior would break for those supplementary code points that have case mappings. This is how it should have worked when supplementary character support was first introduced in the JDK. Thus, even though this is technically an incompatibility, very few applications are expected to be affected.
    • Java API
    • SE

      Summary

      Date/Time names with supplementary characters cannot be parsed in a case-insensitive manner.

      Problem

      JDK 15 added a new locale, "ff-Adlm-LR", whose locale data (such as month and day names) is written in the Adlam script, which is encoded in a supplementary plane. java.text.DateFormat parses those names in a case-insensitive manner, but it throws an exception because the underlying String.regionMatches() call with ignoreCase == true fails for supplementary characters. For example:

      "\ud83a\udd2e".regionMatches(true, 0, "\ud83a\udd0c", 0, 2)

      returns false, where:

      "\ud83a\udd2e" == 'ADLAM SMALL LETTER O' (U+1E92E)
      "\ud83a\udd0c" == 'ADLAM CAPITAL LETTER O' (U+1E90C)

      even though each of the following statements evaluates to true:

      "\ud83a\udd2e".toUpperCase(Locale.ROOT).equals("\ud83a\udd0c")
      Character.toUpperCase(0x1e92e) == 0x1e90c
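
The mismatch described above can be reproduced with a small program. This is a minimal sketch, assuming a JDK whose Unicode data includes the Adlam block; the class name AdlamCaseDemo and its helper methods are hypothetical and not part of the JDK. The result of regionMatches() may differ between JDK releases, so it is printed rather than asserted.

```java
import java.util.Locale;

public class AdlamCaseDemo {
    // U+1E92E ADLAM SMALL LETTER O and U+1E90C ADLAM CAPITAL LETTER O,
    // each written as a surrogate pair.
    static final String SMALL_O = "\ud83a\udd2e";
    static final String CAPITAL_O = "\ud83a\udd0c";

    // Case mapping applied to the code point (int) rather than to a char.
    static boolean codePointMappingAgrees() {
        return Character.toUpperCase(SMALL_O.codePointAt(0)) == CAPITAL_O.codePointAt(0);
    }

    // Case mapping applied to the whole string.
    static boolean stringMappingAgrees() {
        return SMALL_O.toUpperCase(Locale.ROOT).equals(CAPITAL_O);
    }

    // The call from the problem statement; the outcome depends on the JDK in use.
    static boolean regionMatchesIgnoringCase() {
        return SMALL_O.regionMatches(true, 0, CAPITAL_O, 0, 2);
    }

    public static void main(String[] args) {
        System.out.println("code point mapping agrees: " + codePointMappingAgrees());
        System.out.println("string mapping agrees:     " + stringMappingAgrees());
        System.out.println("regionMatches(true, ...):  " + regionMatchesIgnoringCase());
    }
}
```

If the first two lines print true while the third prints false, the runtime shows the behavior this issue describes.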

      Solution

      Change the specifications of String.regionMatches(boolean, ...), String.equalsIgnoreCase(), and String.compareToIgnoreCase() to compare by code point when supplementary characters are involved. Characters in the Basic Multilingual Plane (<= \uFFFF) continue to be compared as code units obtained from the charAt() method.

      Although this change alters how the strings are traversed for comparison, the rationale is that these String methods should behave consistently across all characters (code points), whether or not they are in the Basic Multilingual Plane. There is no reason to exclude supplementary characters from case-insensitive string comparison.

      Specification

      Append the following sentence just after the last list item of conditions in the method description of the String.regionMatches(boolean, ...) method.

      * If both <i>toffset+k</i> and <i>ooffset+k</i> point to
      * supplementary characters, that is, index <i>k</i> points to a high
      * surrogate and <i>k+1</i> points to a low surrogate,
      * {@code codePointAt()} is used to retrieve the code points in place of
      * the {@code charAt()} method, and <i>k+1</i> is excluded from the above
      * condition. If they point to an unpaired high or low surrogate, they
      * are compared using the {@code charAt()} method.
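
The proposed rule can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the JDK implementation: the class CodePointRegionMatch and its methods are hypothetical names, and the bounds handling is a simplified approximation of what String.regionMatches() does.

```java
public class CodePointRegionMatch {
    // Folds case the way the String javadoc describes: upper-case, then lower-case.
    static int fold(int codePoint) {
        return Character.toLowerCase(Character.toUpperCase(codePoint));
    }

    // True when index i starts a well-formed surrogate pair lying entirely before 'end'.
    static boolean pairAt(String s, int i, int end) {
        return i + 1 < end
                && Character.isHighSurrogate(s.charAt(i))
                && Character.isLowSurrogate(s.charAt(i + 1));
    }

    // Case-insensitive region comparison that uses code points for surrogate
    // pairs and code units for everything else, including unpaired surrogates.
    static boolean regionMatchesIgnoreCase(String a, int toffset,
                                           String b, int ooffset, int len) {
        if (toffset < 0 || ooffset < 0
                || (long) toffset + len > a.length()
                || (long) ooffset + len > b.length()) {
            return false;
        }
        int k = 0;
        while (k < len) {
            int i = toffset + k;
            int j = ooffset + k;
            if (pairAt(a, i, toffset + len) && pairAt(b, j, ooffset + len)) {
                // Both sides hold a supplementary character: compare code points
                // and skip the low surrogates at k+1.
                if (fold(a.codePointAt(i)) != fold(b.codePointAt(j))) {
                    return false;
                }
                k += 2;
            } else {
                // BMP characters and unpaired surrogates: compare code units.
                if (fold(a.charAt(i)) != fold(b.charAt(j))) {
                    return false;
                }
                k++;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(regionMatchesIgnoreCase("\ud83a\udd2e", 0, "\ud83a\udd0c", 0, 2));
        System.out.println(regionMatchesIgnoreCase("abc", 0, "ABC", 0, 3));
    }
}
```

Keeping the code-unit path for BMP characters mirrors the Solution section, which leaves those comparisons on charAt().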

      Change the following list item of conditions in the method description of the String.equalsIgnoreCase() method from:

      *   <li> Calling {@code Character.toLowerCase(Character.toUpperCase(char))}
      *        on each character produces the same result

      to:

      *   <li> Calling {@code Character.toLowerCase(Character.toUpperCase(int))}
      *        on each code point produces the same result

      Change the following text in the method description of the String.compareToIgnoreCase() method from:

      * {@code Character.toLowerCase(Character.toUpperCase(character))} on
      * each character.

      to:

      * {@code Character.toLowerCase(Character.toUpperCase(int))} on
      * each code point of the character.
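
For illustration, a comparison that applies Character.toLowerCase(Character.toUpperCase(int)) to each code point might look like the sketch below. The class CodePointCaseCompare and its methods are hypothetical. As a simplifying assumption it walks both strings purely by code point, which is not the same as the mixed charAt()/codePointAt() traversal proposed for regionMatches(), and it only guarantees the sign of the result.

```java
public class CodePointCaseCompare {
    // Upper-case then lower-case the code point, as in the proposed wording.
    static int fold(int codePoint) {
        return Character.toLowerCase(Character.toUpperCase(codePoint));
    }

    // Returns a negative, zero, or positive value, ignoring case differences.
    static int compareIgnoreCase(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                int fa = fold(ca);
                int fb = fold(cb);
                if (fa != fb) {
                    return Integer.compare(fa, fb);
                }
            }
            // Advance by one or two code units depending on the code point.
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // Whichever string still has code units left sorts after the other.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    static boolean equalsIgnoreCase(String a, String b) {
        return compareIgnoreCase(a, b) == 0;
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoreCase("\ud83a\udd2e", "\ud83a\udd0c"));
        System.out.println(compareIgnoreCase("apple", "BANANA"));
    }
}
```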

            naoto Naoto Sato
            webbuggrp Webbug Group