Type: CSR
Resolution: Approved
Priority: P3
Fix Version/s: 19
Component/s: core-libs
Labels: None
Subcomponent: java.lang
Compatibility Kind: source
Compatibility Risk: low
Compatibility Risk Description: The Unicode Standard maintains backward compatibility across versions, so adopting version 14.0 in the JDK is not expected to cause backward compatibility issues.
Interface Kind: Java API
Scope: SE

Summary

Support the Unicode Standard version 14.0.0 in the JDK.

Problem

The JDK needs to keep up with the latest Unicode Standard. Otherwise, interoperability with other platforms that have already adopted it would be problematic.

Solution

Incorporate Unicode 14.0, which adds 838 characters, 12 blocks, and 5 scripts relative to Unicode 13.0. Detailed changes are described on the Unicode Consortium's Unicode 14.0 page.

The java.text.Bidi and java.text.Normalizer classes will be upgraded to the 14.0 level of Unicode Standard Annex #9 and Annex #15, respectively.
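
As a rough illustration of the two classes named above, the following sketch exercises java.text.Normalizer and java.text.Bidi on inputs whose behavior does not depend on the Unicode 14.0 additions. The class name BidiNormalizerSketch and its helper methods are hypothetical and exist only for this example.

```java
import java.text.Bidi;
import java.text.Normalizer;

// Hypothetical demo class; not part of the CSR.
class BidiNormalizerSketch {

    // Compose a string to Normalization Form C (Unicode Standard Annex #15).
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Report whether the text contains both left-to-right and right-to-left
    // runs according to the Unicode Bidirectional Algorithm (Annex #9).
    static boolean isMixedDirection(String s) {
        return new Bidi(s, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).isMixed();
    }

    public static void main(String[] args) {
        // "e" followed by COMBINING ACUTE ACCENT composes to a single code point.
        System.out.println(toNfc("e\u0301").equals("\u00E9"));
        // Latin letters followed by Hebrew letters give mixed directionality.
        System.out.println(isMixedDirection("abc \u05D0\u05D1"));
    }
}
```

Characters newly assigned in Unicode 14.0 would only be classified correctly by these classes on a JDK that includes this update.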

Support for Unicode extended grapheme clusters in java.util.regex.Pattern will be upgraded to the 14.0 level of Unicode Standard Annex #29, "Unicode Text Segmentation."
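
For readers unfamiliar with the grapheme cluster support mentioned above, here is a minimal sketch that counts extended grapheme clusters with the \X matcher of java.util.regex.Pattern. The class GraphemeClusterSketch and the method countClusters are hypothetical names, and the sample input uses only characters that predate Unicode 14.0.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the CSR.
class GraphemeClusterSketch {

    // \X matches one extended grapheme cluster (Annex #29).
    private static final Pattern CLUSTER = Pattern.compile("\\X");

    // Count user-perceived characters rather than UTF-16 code units.
    static int countClusters(String s) {
        Matcher m = CLUSTER.matcher(s);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // "a", then "e" followed by a combining acute accent.
        String s = "ae\u0301";
        System.out.println(s.length() + " code units, "
                + countClusters(s) + " grapheme clusters");
    }
}
```

The cluster boundaries found for text that uses characters added in Unicode 14.0 depend on the Unicode data shipped with the running JDK.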

For more specific delta charts, refer to the delta page on Unicode.org.

Specification

Change the class description of the java.lang.Character class as follows:

@@ -61,11 +61,11 @@
  * This file specifies properties including name and category for every
  * assigned Unicode code point or character range. The file is available
  * from the Unicode Consortium at
  * <a href="http://www.unicode.org">http://www.unicode.org</a>.
  * <p>
- * Character information is based on the Unicode Standard, version 13.0.
+ * Character information is based on the Unicode Standard, version 14.0.
  * <p>
  * The Java platform has supported different versions of the Unicode
  * Standard over time. Upgrades to newer versions of the Unicode Standard
  * occurred in the following Java releases, each indicating the new version:
  * <table class="striped">
@@ -73,10 +73,12 @@
  * <thead>
  * <tr><th scope="col">Java release</th>
  *     <th scope="col">Unicode version</th></tr>
  * </thead>
  * <tbody>
+ * <tr><th scope="row" style="text-align:left">Java SE 19</th>
+ *     <td>Unicode 14.0</td></tr>
  * <tr><th scope="row" style="text-align:left">Java SE 15</th>
  *     <td>Unicode 13.0</td></tr>
  * <tr><th scope="row" style="text-align:left">Java SE 13</th>
  *     <td>Unicode 12.1</td></tr>
  * <tr><th scope="row" style="text-align:left">Java SE 12</th>

In the java.lang.Character.UnicodeBlock class, add the following new fields:

 /**
  * Constant for the "Arabic Extended-B" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock ARABIC_EXTENDED_B

 /**
  * Constant for the "Vithkuqi" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock VITHKUQI

 /**
  * Constant for the "Latin Extended-F" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock LATIN_EXTENDED_F

 /**
  * Constant for the "Old Uyghur" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock OLD_UYGHUR

 /**
  * Constant for the "Unified Canadian Aboriginal Syllabics Extended-A" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock UNIFIED_CANADIAN_ABORIGINAL_SYLLABICS_EXTENDED_A

 /**
  * Constant for the "Cypro-Minoan" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock CYPRO_MINOAN

 /**
  * Constant for the "Tangsa" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock TANGSA

 /**
  * Constant for the "Kana Extended-B" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock KANA_EXTENDED_B

 /**
  * Constant for the "Znamenny Musical Notation" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock ZNAMENNY_MUSICAL_NOTATION

 /**
  * Constant for the "Latin Extended-G" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock LATIN_EXTENDED_G

 /**
  * Constant for the "Toto" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock TOTO

 /**
  * Constant for the "Ethiopic Extended-B" Unicode
  * character block.
  * @since 19
  */
 public static final UnicodeBlock ETHIOPIC_EXTENDED_B
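
A minimal sketch of how the new block constants can be observed at run time, assuming a hypothetical class UnicodeBlockSketch. It looks blocks up by name and by code point instead of referencing the new fields directly, so it also compiles on releases before 19, where the lookups simply report that the block is unknown.

```java
// Hypothetical demo class; not part of the CSR.
class UnicodeBlockSketch {

    // Returns the block identifier for a code point, or "(none)" if the
    // running JDK does not place the code point in any defined block.
    static String blockNameOf(int codePoint) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        return block == null ? "(none)" : block.toString();
    }

    // Returns true if the running JDK recognizes the given block name.
    static boolean isBlockKnown(String blockName) {
        try {
            Character.UnicodeBlock.forName(blockName);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // U+10570 lies in the Vithkuqi block introduced by Unicode 14.0.
        System.out.println(blockNameOf(0x10570));
        System.out.println(isBlockKnown("Vithkuqi"));
    }
}
```

On a JDK that includes this change, both lookups are expected to resolve to the VITHKUQI constant listed above.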

In the java.lang.Character.UnicodeScript enum, add the following new enum constants:

 /**
  * Unicode script "Vithkuqi".
  * @since 19
  */
 VITHKUQI,

 /**
  * Unicode script "Old Uyghur".
  * @since 19
  */
 OLD_UYGHUR,

 /**
  * Unicode script "Cypro Minoan".
  * @since 19
  */
 CYPRO_MINOAN,

 /**
  * Unicode script "Tangsa".
  * @since 19
  */
 TANGSA,

 /**
  * Unicode script "Toto".
  * @since 19
  */
 TOTO
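
To show how the new script constants surface through the existing API, the sketch below collects the scripts used in a string with Character.UnicodeScript.of. The class UnicodeScriptSketch and the method scriptsIn are hypothetical names. The code avoids referencing the new constants directly, so it also compiles on releases before 19, where the Unicode 14.0 code point is reported as UNKNOWN.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical demo class; not part of the CSR.
class UnicodeScriptSketch {

    // Collects the script of every code point in the text.
    static Set<Character.UnicodeScript> scriptsIn(String text) {
        Set<Character.UnicodeScript> scripts =
                EnumSet.noneOf(Character.UnicodeScript.class);
        text.codePoints()
            .forEach(cp -> scripts.add(Character.UnicodeScript.of(cp)));
        return scripts;
    }

    public static void main(String[] args) {
        // "A" plus U+10570, a Vithkuqi letter assigned in Unicode 14.0.
        String sample = "A" + new String(Character.toChars(0x10570));
        System.out.println(scriptsIn(sample));
    }
}
```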

CSR of: JDK-8268081 Update Unicode Data Files to 14.0.0

Resolved

Assignee: Naoto Sato
Reporter: Naoto Sato
Reviewed By: Joe Wang
Votes: 0
Watchers: 2
Created: 2022-01-05 13:24
Updated: 2022-01-11 21:59
Resolved: 2022-01-11 21:59
