JDK-8367272

Update Unicode Data Files to 17.0.0

    • Type: CSR
    • Resolution: Unresolved
    • Priority: P3
    • 26
    • core-libs
    • None
    • source, behavioral
    • low
    • The risk is low, as the changes are simply new code point assignments, new scripts, and new blocks.
    • Java API
    • SE

      Summary

      Support the Unicode Standard version 17.0 in the JDK.

      Problem

      Keeping up with the latest Unicode Standard is imperative; otherwise, interoperability with other platforms becomes problematic.

      Solution

      Incorporate Unicode 17.0, which added 4,803 characters, 4 new scripts, and 8 new blocks since Unicode 16.0. Detailed changes are described on the Unicode Consortium's 17.0 website.

      The java.text.Bidi and java.text.Normalizer classes will be upgraded to the 17.0 level of Unicode Standard Annexes #9 and #15, respectively.
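
      As a rough illustration of the two APIs affected, the sketch below normalizes a string to NFC and checks whether a string mixes text directions. The class and method names (BidiNormalizerSketch, toNfc, isMixedDirection) are hypothetical, and the sample characters are long-established ones rather than anything new in 17.0, so the behavior shown should not depend on this upgrade.

```java
import java.text.Bidi;
import java.text.Normalizer;

// Hypothetical helper class, for illustration only.
class BidiNormalizerSketch {

    /** Returns the NFC (canonical composition) form of the input, per UAX #15. */
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    /** Returns true if the text contains both left-to-right and right-to-left runs, per UAX #9. */
    static boolean isMixedDirection(String s) {
        return new Bidi(s, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).isMixed();
    }

    public static void main(String[] args) {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
        System.out.println(toNfc("e\u0301").equals("\u00e9"));

        // Latin letters followed by three Hebrew letters give mixed directionality.
        System.out.println(isMixedDirection("abc \u05d0\u05d1\u05d2"));
    }
}
```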

      Support for Unicode extended grapheme clusters in java.util.regex.Pattern will be upgraded to the 17.0 level of Unicode Standard Annex #29, "Unicode Text Segmentation."
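
      For context, extended grapheme clusters are exposed in java.util.regex.Pattern through the \X matcher. The sketch below counts clusters in a string; the class and method names are hypothetical, and the inputs use segmentation rules that predate 17.0, so this only shows where the upgraded rules take effect, not what changed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class GraphemeClusterSketch {

    // \X matches one Unicode extended grapheme cluster.
    private static final Pattern CLUSTER = Pattern.compile("\\X");

    /** Counts the extended grapheme clusters found in the input. */
    static int countClusters(String s) {
        Matcher m = CLUSTER.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Base letter plus combining accent forms one cluster; the trailing 'a' is a second.
        System.out.println(countClusters("e\u0301a"));

        // CR followed by LF is treated as a single cluster.
        System.out.println(countClusters("\r\n"));
    }
}
```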

      For more specific delta charts, refer to Unicode.org's delta page.

      Specification

      Change the class description in the java.lang.Character class as follows:

      @@ -61,11 +61,11 @@
         * This file specifies properties including name and category for every
         * assigned Unicode code point or character range. The file is available
         * from the Unicode Consortium at
         * <a href="http://www.unicode.org">http://www.unicode.org</a>.
         * <p>
      -  * Character information is based on the Unicode Standard, version 16.0.
      +  * Character information is based on the Unicode Standard, version 17.0.
         * <p>
         * The Java platform has supported different versions of the Unicode
         * Standard over time. Upgrades to newer versions of the Unicode Standard
         * occurred in the following Java releases, each indicating the new version:
         * <table class="striped">
      @@ -73,10 +73,12 @@
         * <thead>
         * <tr><th scope="col">Java release</th>
         *     <th scope="col">Unicode version</th></tr>
         * </thead>
         * <tbody>
      +  * <tr><th scope="row" style="text-align:left">Java SE 26</th>
      +  *     <td>Unicode 17.0</td></tr>
         * <tr><th scope="row" style="text-align:left">Java SE 24</th>
         *     <td>Unicode 16.0</td></tr>
         * <tr><th scope="row" style="text-align:left">Java SE 22</th>
         *     <td>Unicode 15.1</td></tr>
         * <tr><th scope="row" style="text-align:left">Java SE 20</th>

      In the java.lang.Character.UnicodeBlock class, add the following new fields:

      +         /**
      +          * Constant for the "Sidetic" Unicode
      +          * character block.
      +          * @since 26
      +          */
      +         public static final UnicodeBlock SIDETIC =
      +             new UnicodeBlock("SIDETIC");
      + 
      +         /**
      +          * Constant for the "Sharada Supplement" Unicode
      +          * character block.
      +          * @since 26
      +          */
      +         public static final UnicodeBlock SHARADA_SUPPLEMENT =
      +             new UnicodeBlock("SHARADA_SUPPLEMENT",
      +                 "SHARADA SUPPLEMENT",
      +                 "SHARADASUPPLEMENT");
      + 
      +         /**
      +          * Constant for the "Tolong Siki" Unicode
      +          * character block.
      +          * @since 26
      +          */
      +         public static final UnicodeBlock TOLONG_SIKI =
      +             new UnicodeBlock("TOLONG_SIKI",
      +                 "TOLONG SIKI",
      +                 "TOLONGSIKI");
      + 
      +         /**
      +          * Constant for the "Beria Erfe" Unicode
      +          * character block.
      +          * @since 26
      +          */
      +         public static final UnicodeBlock BERIA_ERFE =
      +             new UnicodeBlock("BERIA_ERFE",
      +                 "BERIA ERFE",
      +                 "BERIAERFE");
      + 
      +         /**
      +          * Constant for the "Tangut Components Supplement" Unicode
      +          * character block.
      +          * @since 26
      +          */
      +         public static final UnicodeBlock TANGUT_COMPONENTS_SUPPLEMENT =
      +             new UnicodeBlock("TANGUT_COMPONENTS_SUPPLEMENT",
      +                 "TANGUT COMPONENTS SUPPLEMENT",
      +                 "TANGUTCOMPONENTSSUPPLEMENT");
      + 
      +         /**
      +          * Constant for the "Miscellaneous Symbols Supplement" Unicode
      +          * character block.
      +          * @since 26
      +          */
      +         public static final UnicodeBlock MISCELLANEOUS_SYMBOLS_SUPPLEMENT =
      +             new UnicodeBlock("MISCELLANEOUS_SYMBOLS_SUPPLEMENT",
      +                 "MISCELLANEOUS SYMBOLS SUPPLEMENT",
      +                 "MISCELLANEOUSSYMBOLSSUPPLEMENT");
      + 
      +         /**
      +          * Constant for the "Tai Yo" Unicode
      +          * character block.
      +          * @since 26
      +          */
      +         public static final UnicodeBlock TAI_YO =
      +             new UnicodeBlock("TAI_YO",
      +                 "TAI YO",
      +                 "TAIYO");
      + 
      +         /**
      +          * Constant for the "CJK Unified Ideographs Extension J" Unicode
      +          * character block.
      +          * @since 26
      +          */
      +         public static final UnicodeBlock CJK_UNIFIED_IDEOGRAPHS_EXTENSION_J =
      +             new UnicodeBlock("CJK_UNIFIED_IDEOGRAPHS_EXTENSION_J",
      +                 "CJK UNIFIED IDEOGRAPHS EXTENSION J",
      +                 "CJKUNIFIEDIDEOGRAPHSEXTENSIONJ");
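
      The extra constructor arguments above register alternate spellings that UnicodeBlock.forName accepts. A minimal sketch of looking up a block by name follows, assuming the caller may run on a JDK that predates these constants; the helper findBlock and its class are hypothetical, and forName is wrapped because it throws IllegalArgumentException for names the running JDK does not know.

```java
import java.lang.Character.UnicodeBlock;
import java.util.Optional;

// Hypothetical helper class, for illustration only.
class BlockLookupSketch {

    /** Looks up a block by name, returning empty if the running JDK does not define it. */
    static Optional<UnicodeBlock> findBlock(String name) {
        try {
            return Optional.of(UnicodeBlock.forName(name));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The three spellings correspond to the identifier and the two aliases in the field definition.
        String[] names = {"TOLONG_SIKI", "Tolong Siki", "TOLONGSIKI", "Basic Latin"};
        for (String name : names) {
            System.out.println(name + " -> " + findBlock(name).map(UnicodeBlock::toString).orElse("(not defined in this JDK)"));
        }
    }
}
```

      On a JDK without the new fields, the Tolong Siki lookups are expected to come back empty rather than fail, which keeps the calling code usable across releases.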

      In the java.lang.Character.UnicodeScript class, add the following new fields:

      +         /**
      +          * Unicode script "Sidetic".
      +          * @since 26
      +          */
      +         SIDETIC,
      + 
      +         /**
      +          * Unicode script "Tolong Siki".
      +          * @since 26
      +          */
      +         TOLONG_SIKI,
      + 
      +         /**
      +          * Unicode script "Beria Erfe".
      +          * @since 26
      +          */
      +         BERIA_ERFE,
      + 
      +         /**
      +          * Unicode script "Tai Yo".
      +          * @since 26
      +          */
      +         TAI_YO,
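
      Script constants can be resolved by name with UnicodeScript.forName, and a code point's script with UnicodeScript.of. The sketch below is a minimal example under the assumption that the code may also run on a JDK older than 26, where the new script names are unknown; findScript and its class are hypothetical names.

```java
import java.lang.Character.UnicodeScript;
import java.util.Optional;

// Hypothetical helper class, for illustration only.
class ScriptLookupSketch {

    /** Looks up a script by name, returning empty if the running JDK does not define it. */
    static Optional<UnicodeScript> findScript(String name) {
        try {
            return Optional.of(UnicodeScript.forName(name));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The first four names are the enum constants added by this change.
        String[] names = {"SIDETIC", "TOLONG_SIKI", "BERIA_ERFE", "TAI_YO", "LATIN"};
        for (String name : names) {
            System.out.println(name + " -> " + findScript(name).map(Enum::name).orElse("(not defined in this JDK)"));
        }

        // Script of an individual code point.
        System.out.println(UnicodeScript.of('A'));
    }
}
```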

            naoto Naoto Sato