JDK-8339568

Update Unicode Data Files to 16.0.0

    • CSR
    • Resolution: Approved
    • P3
    • 24
    • core-libs
    • None
    • source
    • minimal
    • The risk is minimal, as the changes are simply new code point assignments, new scripts, and new blocks.
    • Java API
    • SE

      Summary

      Support the Unicode Standard version 16.0 in the JDK.

      Problem

      The JDK needs to keep up with the latest Unicode Standard. Otherwise, interoperability with other platforms that have already adopted it would be problematic.

      Solution

      Incorporate Unicode 16.0, which adds 5,185 characters, 7 new scripts, and 10 new blocks relative to Unicode 15.1.0. Detailed changes are described on the Unicode Consortium's 16.0 website.

      The java.text.Bidi and java.text.Normalizer classes will be upgraded to the 16.0 level of Unicode Annex #9 and #15, respectively.
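
      As a reminder of what these two classes do, here is a minimal sketch of their public entry points. The class name TextApiSketch and the helper names toNfc and needsBidi are hypothetical and exist only for this illustration. The sample strings use long-established characters, so the results should not depend on which Unicode version the runtime supports.

```java
import java.text.Bidi;
import java.text.Normalizer;

public class TextApiSketch {
    /** Composes a string to NFC using java.text.Normalizer (Unicode Annex #15). */
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    /** Returns true when java.text.Bidi (Unicode Annex #9) reports text that is not purely left-to-right. */
    static boolean needsBidi(String s) {
        Bidi bidi = new Bidi(s, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return !bidi.isLeftToRight();
    }

    public static void main(String[] args) {
        // "e" followed by COMBINING ACUTE ACCENT composes to U+00E9.
        System.out.println(toNfc("e\u0301").equals("\u00e9"));
        // Plain Latin text is purely left-to-right.
        System.out.println(needsBidi("abc"));
        // Latin text followed by Hebrew letters is mixed-direction.
        System.out.println(needsBidi("abc \u05d0\u05d1\u05d2"));
    }
}
```

      The same calls apply to text containing characters added in 16.0; only the underlying data tables change.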

      Support for Unicode extended grapheme clusters in java.util.regex.Pattern will be upgraded to the 16.0 level of Unicode Annex #29, "Unicode Text Segmentation."
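
      For context, extended grapheme clusters are exposed in java.util.regex.Pattern through the \X matcher. The following sketch splits a string into clusters. The class name GraphemeSketch and the method name clusters are hypothetical, and the sample input is chosen so that its segmentation should be the same before and after this upgrade.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class GraphemeSketch {
    // \X matches one Unicode extended grapheme cluster.
    private static final Pattern CLUSTER = Pattern.compile("\\X");

    /** Splits the input into extended grapheme clusters. */
    static List<String> clusters(String s) {
        return CLUSTER.matcher(s)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "e" + COMBINING ACUTE ACCENT forms one cluster; "a" forms another.
        List<String> parts = clusters("e\u0301a");
        System.out.println(parts.size());
    }
}
```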

      For more specific delta charts, refer to Unicode.org's delta page.

      Specification

      Change the class description in the java.lang.Character class as follows:

      @@ -61,11 +61,11 @@
         * This file specifies properties including name and category for every
         * assigned Unicode code point or character range. The file is available
         * from the Unicode Consortium at
         * <a href="http://www.unicode.org">http://www.unicode.org</a>.
         * <p>
      -  * Character information is based on the Unicode Standard, version 15.1.
      +  * Character information is based on the Unicode Standard, version 16.0.
         * <p>
         * The Java platform has supported different versions of the Unicode
         * Standard over time. Upgrades to newer versions of the Unicode Standard
         * occurred in the following Java releases, each indicating the new version:
         * <table class="striped">
      @@ -73,10 +73,12 @@
         * <thead>
         * <tr><th scope="col">Java release</th>
         *     <th scope="col">Unicode version</th></tr>
         * </thead>
         * <tbody>
      +  * <tr><th scope="row" style="text-align:left">Java SE 24</th>
      +  *     <td>Unicode 16.0</td></tr>
         * <tr><th scope="row" style="text-align:left">Java SE 22</th>
         *     <td>Unicode 15.1</td></tr>
         * <tr><th scope="row" style="text-align:left">Java SE 20</th>
         *     <td>Unicode 15.0</td></tr>
         * <tr><th scope="row" style="text-align:left">Java SE 19</th>
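
      The release table in the class description can be read as a floor lookup: a Java release supports the Unicode version of the most recent upgrade at or before it. The sketch below encodes only the three rows that appear in full in the excerpt above (Java SE 20, 22, and 24) and returns an empty result for earlier releases. The class name UnicodeVersionTable and the method name unicodeVersionFor are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

public class UnicodeVersionTable {
    // Rows taken from the Javadoc table excerpt; earlier rows are intentionally omitted.
    private static final TreeMap<Integer, String> UPGRADES = new TreeMap<>();
    static {
        UPGRADES.put(20, "15.0");
        UPGRADES.put(22, "15.1");
        UPGRADES.put(24, "16.0");
    }

    /** Returns the Unicode version for a Java feature release, if covered by the rows above. */
    static Optional<String> unicodeVersionFor(int javaFeatureRelease) {
        Map.Entry<Integer, String> entry = UPGRADES.floorEntry(javaFeatureRelease);
        return entry == null ? Optional.empty() : Optional.of(entry.getValue());
    }

    public static void main(String[] args) {
        // Runtime.version().feature() reports the running JDK's feature release.
        int feature = Runtime.version().feature();
        System.out.println(feature + " -> " + unicodeVersionFor(feature).orElse("not in this excerpt"));
    }
}
```

      The lookup assumes that no upgrade between the listed releases is missing from the table, which matches how the Javadoc table is described.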

      In the java.lang.Character.UnicodeBlock class, add the following new fields:

      +         /**
      +          * Constant for the "Todhri" Unicode
      +          * character block.
      +          * @since 24
      +          */
      +         public static final UnicodeBlock TODHRI =
      +                 new UnicodeBlock("TODHRI");
      + 
      +         /**
      +          * Constant for the "Garay" Unicode
      +          * character block.
      +          * @since 24
      +          */
      +         public static final UnicodeBlock GARAY =
      +                 new UnicodeBlock("GARAY");
      + 
      +         /**
      +          * Constant for the "Tulu-Tigalari" Unicode
      +          * character block.
      +          * @since 24
      +          */
      +         public static final UnicodeBlock TULU_TIGALARI =
      +                 new UnicodeBlock("TULU_TIGALARI",
      +                         "TULU-TIGALARI");
      + 
      +         /**
      +          * Constant for the "Myanmar Extended-C" Unicode
      +          * character block.
      +          * @since 24
      +          */
      +         public static final UnicodeBlock MYANMAR_EXTENDED_C =
      +                 new UnicodeBlock("MYANMAR_EXTENDED_C",
      +                         "MYANMAR EXTENDED-C",
      +                         "MYANMAREXTENDED-C");
      + 
      +         /**
      +          * Constant for the "Sunuwar" Unicode
      +          * character block.
      +          * @since 24
      +          */
      +         public static final UnicodeBlock SUNUWAR =
      +                 new UnicodeBlock("SUNUWAR");
      + 
      +         /**
      +          * Constant for the "Egyptian Hieroglyphs Extended-A" Unicode
      +          * character block.
      +          * @since 24
      +          */
      +         public static final UnicodeBlock EGYPTIAN_HIEROGLYPHS_EXTENDED_A =
      +                 new UnicodeBlock("EGYPTIAN_HIEROGLYPHS_EXTENDED_A",
      +                         "EGYPTIAN HIEROGLYPHS EXTENDED-A",
      +                         "EGYPTIANHIEROGLYPHSEXTENDED-A");
      + 
      +         /**
      +          * Constant for the "Gurung Khema" Unicode
      +          * character block.
      +          * @since 24
      +          */
      +         public static final UnicodeBlock GURUNG_KHEMA =
      +                 new UnicodeBlock("GURUNG_KHEMA",
      +                         "GURUNG KHEMA",
      +                         "GURUNGKHEMA");
      + 
      +         /**
      +          * Constant for the "Kirat Rai" Unicode
      +          * character block.
      +          * @since 24
      +          */
      +         public static final UnicodeBlock KIRAT_RAI =
      +                 new UnicodeBlock("KIRAT_RAI",
      +                         "KIRAT RAI",
      +                         "KIRATRAI");
      + 
      +         /**
      +          * Constant for the "Symbols for Legacy Computing Supplement" Unicode
      +          * character block.
      +          * @since 24
      +          */
      +         public static final UnicodeBlock SYMBOLS_FOR_LEGACY_COMPUTING_SUPPLEMENT =
      +                 new UnicodeBlock("SYMBOLS_FOR_LEGACY_COMPUTING_SUPPLEMENT",
      +                         "SYMBOLS FOR LEGACY COMPUTING SUPPLEMENT",
      +                         "SYMBOLSFORLEGACYCOMPUTINGSUPPLEMENT");
      + 
      +         /**
      +          * Constant for the "Ol Onal" Unicode
      +          * character block.
      +          * @since 24
      +          */
      +         public static final UnicodeBlock OL_ONAL =
      +                 new UnicodeBlock("OL_ONAL",
      +                         "OL ONAL",
      +                         "OLONAL");
      + 
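
      The extra constructor arguments above are the alternative names under which UnicodeBlock.forName can find each block. A minimal sketch of how that lookup can be probed at run time follows. The class name BlockLookupSketch and the method name findBlock are hypothetical. The "Kirat Rai" spellings come from the constructor arguments shown above, and they are expected to resolve only on a JDK that includes this change.

```java
import java.lang.Character.UnicodeBlock;
import java.util.Optional;

public class BlockLookupSketch {
    /** Looks up a block by name, returning empty if the running JDK does not know it. */
    static Optional<UnicodeBlock> findBlock(String name) {
        try {
            return Optional.of(UnicodeBlock.forName(name));
        } catch (IllegalArgumentException e) {
            // forName rejects names that are not valid block names on this runtime.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The first three spellings name a long-standing block; the rest depend on the JDK version.
        String[] names = {
            "Basic Latin", "BasicLatin", "BASIC_LATIN",
            "Kirat Rai", "KiratRai", "KIRAT_RAI"
        };
        for (String name : names) {
            System.out.println(name + " -> " + findBlock(name).map(Object::toString).orElse("(unknown block)"));
        }
    }
}
```

      Looking blocks up by name rather than referencing the new constants directly keeps the code compilable on releases earlier than 24.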

      In the java.lang.Character.UnicodeScript enum, add the following new enum constants:

      +         /**
      +          * Unicode script "Todhri".
      +          * @since 24
      +          */
      +         TODHRI,
      + 
      +         /**
      +          * Unicode script "Garay".
      +          * @since 24
      +          */
      +         GARAY,
      + 
      +         /**
      +          * Unicode script "Tulu Tigalari".
      +          * @since 24
      +          */
      +         TULU_TIGALARI,
      + 
      +         /**
      +          * Unicode script "Sunuwar".
      +          * @since 24
      +          */
      +         SUNUWAR,
      + 
      +         /**
      +          * Unicode script "Gurung Khema".
      +          * @since 24
      +          */
      +         GURUNG_KHEMA,
      + 
      +         /**
      +          * Unicode script "Kirat Rai".
      +          * @since 24
      +          */
      +         KIRAT_RAI,
      + 
      +         /**
      +          * Unicode script "Ol Onal".
      +          * @since 24
      +          */
      +         OL_ONAL,
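
      A short sketch of how the new script constants could be observed from application code. The class name ScriptLookupSketch and the method name findScript are hypothetical. The code point U+10D40 is assumed here to lie in the Garay block; on a runtime without this change, UnicodeScript.of is expected to report UNKNOWN for it.

```java
import java.lang.Character.UnicodeScript;
import java.util.Optional;

public class ScriptLookupSketch {
    /** Looks up a script by name or alias, returning empty if the running JDK does not know it. */
    static Optional<UnicodeScript> findScript(String name) {
        try {
            return Optional.of(UnicodeScript.forName(name));
        } catch (IllegalArgumentException e) {
            // forName rejects names that are not valid script names on this runtime.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A long-standing script resolves on every supported runtime.
        System.out.println(UnicodeScript.of('A'));
        // Names taken from the new enum constants; the result depends on the JDK version.
        System.out.println(findScript("GARAY").map(Enum::name).orElse("(unknown script)"));
        System.out.println(findScript("OL_ONAL").map(Enum::name).orElse("(unknown script)"));
        // Assumed Garay code point; the result depends on the JDK version.
        System.out.println(UnicodeScript.of(0x10D40));
    }
}
```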

            naoto Naoto Sato
            Justin Lu