JDK-8222771

Support for Unicode 12.1

Details

    • CSR
    • Resolution: Approved
    • P4
    • 13
    • core-libs
    • None
    • behavioral
    • low
    • The Unicode Character Database is an evolving standard. Methods in the java.lang.Character class may behave differently from prior releases.
    • Java API
    • SE

    Description

      Summary

      Support Unicode version 12.1 in the JDK.

      Problem

      Characters which have been assigned since Unicode 11.0 cannot be used in the JDK.

      Solution

      Incorporate Unicode 12.1, which adds 555 characters and 4 new scripts relative to Unicode 11.0. The detailed changes are described on the Unicode Consortium's web pages for versions 12.0 and 12.1.

      The java.text.Bidi and java.text.Normalizer classes will be upgraded to the 12.0 level of Unicode Standard Annex #9 and #15, respectively.
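
      For orientation, the two affected classes are used roughly as follows. This is a minimal sketch, not part of the change itself: the class name NormalizerBidiSketch and its helper methods are hypothetical, and the sample strings were chosen so that the results do not depend on the Unicode level of the runtime.

```java
import java.text.Bidi;
import java.text.Normalizer;

// Hypothetical helper class, for illustration only.
class NormalizerBidiSketch {

    // Compose a base letter plus combining mark into its precomposed form (NFC, UAX #15).
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Report whether the bidirectional algorithm (UAX #9) finds both
    // left-to-right and right-to-left runs in the text.
    static boolean isMixedDirection(String s) {
        return new Bidi(s, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).isMixed();
    }

    public static void main(String[] args) {
        // "e" followed by COMBINING ACUTE ACCENT composes to U+00E9.
        System.out.println(toNfc("e\u0301").equals("\u00e9"));
        // Latin text followed by Hebrew letters is mixed-direction.
        System.out.println(isMixedDirection("abc \u05d0\u05d1\u05d2"));
        // Pure Latin text is not.
        System.out.println(isMixedDirection("abc"));
    }
}
```

      Code points that were unassigned before Unicode 12.0 may normalize or resolve differently after the upgrade; the samples above deliberately avoid them.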

      Support for Unicode extended grapheme clusters in java.util.regex.Pattern will be upgraded. It relies on Unicode Standard Annex #29, "Unicode Text Segmentation", which will be upgraded from version 8.0 to 12.0.
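
      In java.util.regex.Pattern, extended grapheme clusters are matched with the \X construct. The sketch below splits a string into clusters; the class name GraphemeClusterSketch and the method clusters are hypothetical. The sample inputs are simple cases whose segmentation should be the same before and after the upgrade; newer emoji sequences are the kind of input where results may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class GraphemeClusterSketch {

    // \X matches one extended grapheme cluster.
    private static final Pattern CLUSTER = Pattern.compile("\\X");

    // Split the text into its extended grapheme clusters, in order.
    static List<String> clusters(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = CLUSTER.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // "e" + COMBINING ACUTE ACCENT forms one cluster, "a" another.
        System.out.println(clusters("e\u0301a").size());
        // CR LF is treated as a single cluster.
        System.out.println(clusters("\r\n").size());
    }
}
```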

      Specification

      Change the following paragraph in the java.lang.Character class description from:

      * <p>
      * The Java SE 13 Platform uses character information from version 11.0
      * of the Unicode Standard, plus the Japanese Era code point,
      * {@code U+32FF}, from the first version of the Unicode Standard
      * after 11.0 that assigns the code point.

      to:

      * <p>
      * Character information is based on the Unicode Standard, version 12.1.
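
      One way to observe the new level from application code is to probe code points assigned in 12.0 and 12.1. This is only a sketch: the class name UnicodeLevelProbe and the method hasUnicode12Data are hypothetical, and the probe assumes that checking two representative code points is a good enough indicator.

```java
// Hypothetical helper class, for illustration only.
class UnicodeLevelProbe {

    // U+32FF SQUARE ERA NAME REIWA, assigned in Unicode 12.1.
    static final int REIWA = 0x32FF;

    // U+10FE0 ELYMAIC LETTER ALEPH, assigned in Unicode 12.0.
    static final int ELYMAIC_ALEPH = 0x10FE0;

    // True when the runtime's character data knows both sample code points.
    static boolean hasUnicode12Data() {
        return Character.isDefined(REIWA) && Character.isDefined(ELYMAIC_ALEPH);
    }

    public static void main(String[] args) {
        System.out.println("Unicode 12.x data present: " + hasUnicode12Data());
        System.out.println("U+10FE0 is a letter: " + Character.isLetter(ELYMAIC_ALEPH));
    }
}
```

      On a runtime that predates this change, the Elymaic code point is reported as undefined, which is the situation described under Problem.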

      Add the following new java.lang.Character.UnicodeScript enum constants:

      /**
       * Unicode script "Elymaic".
       * @since 13  
       */
      ELYMAIC,
      
      /**
       * Unicode script "Nandinagari".
       * @since 13  
       */
      NANDINAGARI,
      
      /**
       * Unicode script "Nyiakeng Puachue Hmong".
       * @since 13  
       */
      NYIAKENG_PUACHUE_HMONG,
      
      /**
       * Unicode script "Wancho".
       * @since 13  
       */
      WANCHO,
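
      The new constants become visible through Character.UnicodeScript.of(int). The following sketch looks up one sample code point per new script; the class name NewScriptsSketch and the method scriptOf are hypothetical, and the sample code points are the first letters of the respective blocks as I understand the Unicode 12.0 charts.

```java
import java.lang.Character.UnicodeScript;

// Hypothetical helper class, for illustration only.
class NewScriptsSketch {

    // Sample code points, one per script added in Unicode 12.0.
    static final int[] SAMPLES = {
        0x10FE0, // ELYMAIC LETTER ALEPH
        0x119A0, // NANDINAGARI LETTER A
        0x1E100, // NYIAKENG PUACHUE HMONG LETTER MA
        0x1E2C0, // WANCHO LETTER AA
    };

    // Name of the enum constant for the script the code point belongs to.
    static String scriptOf(int codePoint) {
        return UnicodeScript.of(codePoint).name();
    }

    public static void main(String[] args) {
        for (int cp : SAMPLES) {
            System.out.printf("U+%X -> %s%n", cp, scriptOf(cp));
        }
    }
}
```

      On a runtime without these constants, the same code points are expected to map to UnicodeScript.UNKNOWN.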

      Add the following new java.lang.Character.UnicodeBlock fields:

      /**
       * Constant for the "Elymaic" Unicode
       * character block.
       * @since 13
       */
      public static final UnicodeBlock ELYMAIC
      
      /**
       * Constant for the "Nandinagari" Unicode
       * character block.
       * @since 13
       */
      public static final UnicodeBlock NANDINAGARI 
      
      /**
       * Constant for the "Tamil Supplement" Unicode
       * character block.               
       * @since 13
       */
      public static final UnicodeBlock TAMIL_SUPPLEMENT 
      
      /**
       * Constant for the "Egyptian Hieroglyph Format Controls" Unicode
       * character block.               
       * @since 13
       */
      public static final UnicodeBlock EGYPTIAN_HIEROGLYPH_FORMAT_CONTROLS 
      
      /**
       * Constant for the "Small Kana Extension" Unicode
       * character block.
       * @since 13
       */
      public static final UnicodeBlock SMALL_KANA_EXTENSION 
      
      /**
       * Constant for the "Nyiakeng Puachue Hmong" Unicode
       * character block.
       * @since 13
       */
      public static final UnicodeBlock NYIAKENG_PUACHUE_HMONG 
      
      /**
       * Constant for the "Wancho" Unicode
       * character block.
       * @since 13
       */
      public static final UnicodeBlock WANCHO 
      
      /**
       * Constant for the "Ottoman Siyaq Numbers" Unicode
       * character block.
       * @since 13
       */
      public static final UnicodeBlock OTTOMAN_SIYAQ_NUMBERS 
      
      /**
       * Constant for the "Symbols and Pictographs Extended-A" Unicode
       * character block.
       * @since 13
       */
      public static final UnicodeBlock SYMBOLS_AND_PICTOGRAPHS_EXTENDED_A
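
      The new fields are what Character.UnicodeBlock.of(int) returns for code points in the corresponding ranges. A minimal sketch follows; the class name NewBlocksSketch and the method blockOf are hypothetical, and the sample code points are taken from the block ranges as I understand the Unicode 12.0 block list.

```java
import java.lang.Character.UnicodeBlock;

// Hypothetical helper class, for illustration only.
class NewBlocksSketch {

    // Sample code points inside some of the blocks added in Unicode 12.0.
    static final int[] SAMPLES = {
        0x11FC0, // Tamil Supplement
        0x13430, // Egyptian Hieroglyph Format Controls
        0x1B150, // Small Kana Extension
        0x1E2C0, // Wancho
        0x1ED01, // Ottoman Siyaq Numbers
        0x1FA70, // Symbols and Pictographs Extended-A
    };

    // Identifier of the block containing the code point, or a marker if none.
    static String blockOf(int codePoint) {
        UnicodeBlock block = UnicodeBlock.of(codePoint);
        return block == null ? "(no block)" : block.toString();
    }

    public static void main(String[] args) {
        for (int cp : SAMPLES) {
            System.out.printf("U+%X -> %s%n", cp, blockOf(cp));
        }
    }
}
```

      UnicodeBlock.of(int) returns null for a code point that belongs to no block known to the runtime, so the helper guards against that case.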

      Change the following paragraph in java.util.regex.Pattern class description from:

       * <p> This class is in conformance with Level 1 of <a
       * href="http://www.unicode.org/reports/tr18/"><i>Unicode Technical
       * Standard #18: Unicode Regular Expression</i></a>, plus RL2.1
       * Canonical Equivalents.

      to:

       * <p> This class is in conformance with Level 1 of <a
       * href="http://www.unicode.org/reports/tr18/"><i>Unicode Technical
       * Standard #18: Unicode Regular Expression</i></a>, plus RL2.1
       * Canonical Equivalents and RL2.2 Extended Grapheme Clusters.

            People

              naoto Naoto Sato
              naoto Naoto Sato
              Roger Riggs