  1. JDK
  2. JDK-8215944

Square character support for the Japanese new era (11uX Backport)

    • CSR
    • Resolution: Withdrawn
    • P3
    • 11-pool
    • core-libs
    • None
    • behavioral
    • minimal
    • Java API
    • SE

      Summary

      Support for the square character for the Japanese new era

      Problem

      A new code point (U+32FF) will be assigned for the upcoming Japanese new era [1]. In the JDK 11 implementation this code point is currently unassigned, so the static methods of the java.lang.Character class treat it as an undefined code point (for example, Character.isDefined(0x32FF) returns false).

      [1] http://blog.unicode.org/2018/09/new-japanese-era.html

      The Java SE 11 specification currently states that the java.lang.Character class derives its data from the Unicode Standard, version 10.0.0. To help industry and end users, both implementation and normative exceptions are requested so that this one important new code point (U+32FF) can be defined in future JDK 11u updates.

      Solution

      Introduce an implementation change that updates the U+32FF code point.

      Modify the character properties of that code point so that it is regarded as a Japanese square era name, similar to the one that represents Meizi (U+337E).

      Specifically,

       Character.isDefined(0x32FF) returns 'true'.
       Character.getType(0x32FF) returns Character.OTHER_SYMBOL
       Character.getName(0x32FF) returns "SQUARE ERA NAME NEWERA"
       Character.getDirectionality(0x32FF) returns Character.DIRECTIONALITY_LEFT_TO_RIGHT
       Character.UnicodeScript.of(0x32FF) returns Character.UnicodeScript.COMMON
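
      The property values listed above can be checked with a small probe. This is only a sketch: the class name SquareEraProbe and its helper methods are hypothetical, and the script lookup uses Character.UnicodeScript.of(int), which is how the Java API spells that call. What it prints for U+32FF depends on the Unicode data bundled with the runtime it runs on, so no output is shown for that code point; U+337E is included as an already-assigned square era name for comparison.

```java
// Hypothetical helper, not part of the JDK: probes the properties this CSR lists.
final class SquareEraProbe {

    /** Returns a one-line summary of the properties listed in the CSR. */
    static String describe(int codePoint) {
        if (!Character.isDefined(codePoint)) {
            // Unassigned in the Unicode data of the running JDK.
            return String.format("U+%04X unassigned", codePoint);
        }
        return String.format("U+%04X %s type=%d dir=%d script=%s",
                codePoint,
                Character.getName(codePoint),
                Character.getType(codePoint),
                Character.getDirectionality(codePoint),
                Character.UnicodeScript.of(codePoint));
    }

    /** True if the code point has the property values the CSR expects for U+32FF. */
    static boolean hasListedProperties(int codePoint) {
        return Character.isDefined(codePoint)
                && Character.getType(codePoint) == Character.OTHER_SYMBOL
                && Character.getDirectionality(codePoint)
                        == Character.DIRECTIONALITY_LEFT_TO_RIGHT
                && Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.COMMON;
    }

    public static void main(String[] args) {
        System.out.println(describe(0x337E)); // existing square era name (Meizi)
        System.out.println(describe(0x32FF)); // result depends on the runtime's Unicode data
        System.out.println(hasListedProperties(0x32FF));
    }
}
```

      On a runtime without the change, the second line would be expected to report the code point as unassigned and the last line to print false.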

      This change will not include any composition/decomposition changes. Those will be included in the upcoming ICU4J changes in their version 63.

      This CSR covers the implementation change that updates the U+32FF code point so that it is regarded as a Japanese square era name. It is a follow-on to CSR JDK-8215945, which covers the normative spec change: adding a paragraph to the java.lang.Character class documentation that describes the behavioral change for the U+32FF code point.

      Specification

      Change "make/tools/UnicodeData/UnicodeData.txt" to modify the properties of code point "U+32FF" so that it would be regarded as a Japanese square era name.

      Below is the relevant section of the hg diff:

       32FE;CIRCLED KATAKANA WO;So;0;L;<circle> 30F2;;;;N;;;;;
       +32FF;SQUARE ERA NAME NEWERA;So;0;L;<square> 5143 53F7;;;;N;SQUARED TWO IDEOGRAPHS ERA NAME NEWERA;;;;
       3300;SQUARE APAATO;So;0;L;<square> 30A2 30D1 30FC 30C8;;;;N;SQUARED APAATO;;;;
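
      To make the added line easier to read, the following sketch splits it into its semicolon-separated fields. The field positions follow the standard UnicodeData.txt layout (code point, name, general category, combining class, bidi class, decomposition, and so on); the class name UnicodeDataRecord and its constants are hypothetical and are not the JDK's build tooling. The "So" category corresponds to Character.OTHER_SYMBOL and the "L" bidi class to Character.DIRECTIONALITY_LEFT_TO_RIGHT.

```java
// Hypothetical illustration of the UnicodeData.txt record format; not JDK build code.
final class UnicodeDataRecord {
    // Field positions in a UnicodeData.txt line.
    static final int CODE_POINT = 0;
    static final int NAME = 1;
    static final int GENERAL_CATEGORY = 2;
    static final int BIDI_CLASS = 4;
    static final int DECOMPOSITION = 5;

    /** Splits a record, keeping trailing empty fields (limit -1). */
    static String[] fields(String line) {
        return line.split(";", -1);
    }

    /** Parses the hexadecimal code point in the first field. */
    static int codePoint(String[] f) {
        return Integer.parseInt(f[CODE_POINT], 16);
    }

    public static void main(String[] args) {
        // The line added by the diff above.
        String line = "32FF;SQUARE ERA NAME NEWERA;So;0;L;<square> 5143 53F7;;;;N;"
                + "SQUARED TWO IDEOGRAPHS ERA NAME NEWERA;;;;";
        String[] f = fields(line);
        System.out.printf("U+%04X name=%s category=%s bidi=%s decomposition=%s%n",
                codePoint(f), f[NAME], f[GENERAL_CATEGORY], f[BIDI_CLASS], f[DECOMPOSITION]);
    }
}
```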

      As a result of the above change, the static methods of the java.lang.Character class return the new properties for the U+32FF code point. This is a behavioral change, but no spec change is anticipated.

      Please note that because the properties of the U+32FF code point are modified, the following JCK tests will fail:

       api/java_lang/Character/UnicodeScript/indexTGF.html#UnicodeScriptTests
       api/java_lang/Character/index.html#attributesFullRange
       api/java_lang/Character/index.html#getAttrFullRange
       api/java_lang/Character/index.html#Methods

      To address the above JCK failures, a separate CSR, JDK-8215945, covers the normative spec change: adding a paragraph to the java.lang.Character class documentation that describes the behavioral change for the U+32FF code point.

            dkejriwal Deepak Kejriwal (Inactive)
            naoto Naoto Sato