CSR
Resolution: Approved
P4
behavioral
minimal
Java API
SE
Summary
Allow additional code points from beyond Unicode 6.2, on which Java SE 8 is based, in order to support GB18030-2022.
Problem
The Chinese national standards body (CESI) has recently published GB18030-2022, an updated version of the GB18030 standard that brings GB18030 in sync with Unicode version 11.0. Since Java SE 8 supports only the characters defined in Unicode 6.2, some characters defined in the new GB18030 standard cannot be represented.
Solution
Allow code points that are required by the Implementation Level 1 definition in the GB18030-2022 standard. Additionally required code points are in the range of U+9FCD to U+9FEF, totaling 35 code points.
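As an illustration only, the following sketch checks the arithmetic of the range above and probes how many of those code points the running implementation reports as assigned. The class and method names (Gb18030RangeProbe, rangeSize, definedCount) are hypothetical and not part of this CSR. The only platform API used is Character.isDefined(int). The second number depends on the implementation, so no particular value is assumed.

```java
// Hypothetical helper for illustration; not part of the JDK or of this CSR.
final class Gb18030RangeProbe {
    // Inclusive bounds of the range quoted in the Solution section.
    static final int FIRST = 0x9FCD;
    static final int LAST = 0x9FEF;

    /** Number of code points in the inclusive range FIRST..LAST. */
    static int rangeSize() {
        return LAST - FIRST + 1;
    }

    /** Counts the code points in the range that this runtime reports as assigned. */
    static int definedCount() {
        int count = 0;
        for (int cp = FIRST; cp <= LAST; cp++) {
            if (Character.isDefined(cp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("range size: " + rangeSize());
        // The result below may differ between implementations of the platform.
        System.out.println("assigned on this runtime: " + definedCount());
    }
}
```

On an implementation that does not take up the extension, the second figure would be expected to be lower than the first. On one that does, the two should match.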
Specification
Modify the second paragraph in the Unicode Conformance section of the class description of java.lang.Character as follows:
diff a/jdk/src/share/classes/java/lang/Character.java b/jdk/src/share/classes/java/lang/Character.java
--- a/jdk/src/share/classes/java/lang/Character.java
+++ b/jdk/src/share/classes/java/lang/Character.java
@@ -50,17 +50,21 @@
* assigned Unicode code point or character range. The file is available
* from the Unicode Consortium at
* <a href="http://www.unicode.org">http://www.unicode.org</a>.
* <p>
* The Java SE 8 Platform uses character information from version 6.2
- * of the Unicode Standard, with two extensions. First, the Java SE 8 Platform
- * allows an implementation of class {@code Character} to use the Japanese Era
- * code point, {@code U+32FF}, from the first version of the Unicode Standard
- * after 6.2 that assigns the code point. Second, in recognition of the fact
+ * of the Unicode Standard, with three extensions. First, in recognition of the fact
* that new currencies appear frequently, the Java SE 8 Platform allows an
* implementation of class {@code Character} to use the Currency Symbols
- * block from version 10.0 of the Unicode Standard. Consequently, the
+ * block from version 10.0 of the Unicode Standard. Second, the Java SE 8 Platform
+ * allows an implementation of class {@code Character} to use the code points
+ * in the range of {@code U+9FCD} to {@code U+9FEF} from version 11.0 of the
+ * Unicode Standard, in order for the class to allow the "Implementation
+ * Level 1" of the Chinese GB18030-2022 standard. Third, the Java SE 8 Platform
+ * allows an implementation of class {@code Character} to use the Japanese Era
+ * code point, {@code U+32FF}, from the Unicode Standard version 12.1.
+ * Consequently, the
* behavior of fields and methods of class {@code Character} may vary across
* implementations of the Java SE 8 Platform when processing the aforementioned
* code points ( outside of version 6.2 ), except for the following methods
* that define Java identifiers:
* {@link #isJavaIdentifierStart(int)}, {@link #isJavaIdentifierStart(char)},
csr of: JDK-8301400 Allow additional characters for GB18030-2022 support (Resolved)