JDK-8301557

Allow additional characters for GB18030-2022 support

Details

    • behavioral
    • minimal
    • The risk is minimal because this CSR only *allows* the new code points and leaves the existing code points intact. The change also prohibits the new code points from being the start or part of Java identifiers, which preserves binary compatibility, as was done for the additions of the Japanese Era and new currency symbol characters.
    • Java API
    • SE

    Description

      Summary

      Allow additional code points from beyond Unicode 6.2, on which Java SE 8 is based, in order to support GB18030-2022.

      Problem

      The Chinese national standards body (CESI) recently published GB18030-2022, an updated version of the GB18030 standard that brings it in sync with Unicode version 11.0. Because Java SE 8 supports only the characters defined in Unicode 6.2, some characters defined in the new GB18030 standard cannot be represented.

      Solution

      Allow the code points that are required by the Implementation Level 1 definition in the GB18030-2022 standard. The additionally required code points are in the range U+9FCD to U+9FEF, 35 code points in total.
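
      As a rough illustration of the range described above, the following sketch computes the size of the inclusive range U+9FCD to U+9FEF and counts how many of those code points the running JDK reports as defined. The class and method names (Gb18030Range, size, definedCount) are hypothetical and not part of this CSR; only Character.isDefined(int) is an existing API. The defined count depends on the Unicode data bundled with the runtime, so no particular value is assumed for it.

```java
// Illustrative sketch only; Gb18030Range and its methods are hypothetical names.
final class Gb18030Range {
    // Range boundaries taken from the Solution text.
    static final int FIRST = 0x9FCD;
    static final int LAST = 0x9FEF;

    // Number of code points in the inclusive range FIRST..LAST.
    static int size() {
        return LAST - FIRST + 1;
    }

    // Counts the code points in the range that the running JDK reports as defined.
    // The result depends on the Unicode data bundled with the runtime.
    static int definedCount() {
        int n = 0;
        for (int cp = FIRST; cp <= LAST; cp++) {
            if (Character.isDefined(cp)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("range size: " + size());
        System.out.println("defined on this runtime: " + definedCount());
    }
}
```

      A runtime that does not allow these code points would presumably report fewer than size() of them as defined.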

      Specification

      Modify the second paragraph in the Unicode Conformance section in the class description of java.lang.Character class as follows:

      diff a/jdk/src/share/classes/java/lang/Character.java b/jdk/src/share/classes/java/lang/Character.java
      --- a/jdk/src/share/classes/java/lang/Character.java
      +++ b/jdk/src/share/classes/java/lang/Character.java
      @@ -50,17 +50,21 @@
        * assigned Unicode code point or character range. The file is available
        * from the Unicode Consortium at
        * <a href="http://www.unicode.org">http://www.unicode.org</a>.
        * <p>
        * The Java SE 8 Platform uses character information from version 6.2
      - * of the Unicode Standard, with two extensions. First, the Java SE 8 Platform
      - * allows an implementation of class {@code Character} to use the Japanese Era
      - * code point, {@code U+32FF}, from the first version of the Unicode Standard
      - * after 6.2 that assigns the code point. Second, in recognition of the fact
      + * of the Unicode Standard, with three extensions. First, in recognition of the fact
        * that new currencies appear frequently, the Java SE 8 Platform allows an
        * implementation of class {@code Character} to use the Currency Symbols
      - * block from version 10.0 of the Unicode Standard. Consequently, the
      + * block from version 10.0 of the Unicode Standard. Second, the Java SE 8 Platform
      + * allows an implementation of class {@code Character} to use the code points
      + * in the range of {@code U+9FCD} to {@code U+9FEF} from version 11.0 of the
      + * Unicode Standard, in order for the class to allow the "Implementation
      + * Level 1" of the Chinese GB18030-2022 standard. Third, the Java SE 8 Platform
      + * allows an implementation of class {@code Character} to use the Japanese Era
      + * code point, {@code U+32FF}, from the Unicode Standard version 12.1.
      + * Consequently, the
        * behavior of fields and methods of class {@code Character} may vary across
        * implementations of the Java SE 8 Platform when processing the aforementioned
        * code points ( outside of version 6.2 ), except for the following methods
        * that define Java identifiers:
        * {@link #isJavaIdentifierStart(int)}, {@link #isJavaIdentifierStart(char)},
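
      To see how a given runtime treats one of the newly allowed code points, a small probe along the following lines may help. It is a sketch, not part of the specification change: CodePointProbe and describe are hypothetical names, and the values printed for U+9FCD depend on the JDK in use. On a Java SE 8 implementation that follows the text above, the identifier methods are expected to keep their Unicode 6.2 behavior for these code points, while the other Character methods may vary.

```java
// Illustrative sketch only; CodePointProbe and describe are hypothetical names.
final class CodePointProbe {
    // Summarizes how the running JDK classifies a single code point.
    static String describe(int cp) {
        return String.format(
                "U+%04X defined=%b letter=%b idStart=%b idPart=%b",
                cp,
                Character.isDefined(cp),
                Character.isLetter(cp),
                Character.isJavaIdentifierStart(cp),
                Character.isJavaIdentifierPart(cp));
    }

    public static void main(String[] args) {
        // First and last code points of the range named in the specification text.
        System.out.println(describe(0x9FCD));
        System.out.println(describe(0x9FEF));
    }
}
```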

            People

              naoto Naoto Sato
              Alan Bateman, Lance Andersen