- CSR
- Resolution: Approved
- P4
- behavioral
- minimal
- Java API
- SE
Summary
Allow additional code points, beyond those in Unicode 10.0 on which Java SE 11 is based, in order to support GB18030-2022.
Problem
The Chinese national standards body (CESI) has recently published GB18030-2022, an updated version of the GB18030 standard that brings it in sync with Unicode version 11.0. Since Java SE 11 supports only the characters defined in Unicode 10.0, some characters defined in the new GB18030 standard cannot be represented.
Solution
Allow the code points that are required by the Implementation Level 1 definition in the GB18030-2022 standard. The additionally required code points are in the range U+9FEB to U+9FEF, totaling 5 code points.
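As a rough illustration that is not part of the CSR itself, the sketch below probes the five code points on whatever JDK it runs on. The class name `Gb18030CodePointProbe` and its helper methods are hypothetical; only `Character.isDefined(int)`, `Character.isIdeographic(int)` and `Character.UnicodeScript.of(int)` are existing `java.lang` APIs. Because the extension is optional, the reported values may differ between implementations of the Java SE 11 Platform.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, used for illustration only.
class Gb18030CodePointProbe {
    // Inclusive range quoted in the Solution text: U+9FEB to U+9FEF.
    static final int FIRST = 0x9FEB;
    static final int LAST = 0x9FEF;

    /** Number of code points in the inclusive range. */
    static int rangeSize() {
        return LAST - FIRST + 1;
    }

    /** Formats a code point in the conventional U+XXXX notation. */
    static String label(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    /** Builds one report line per code point; the property values depend on the runtime. */
    static List<String> report() {
        List<String> lines = new ArrayList<>();
        for (int cp = FIRST; cp <= LAST; cp++) {
            lines.add(label(cp)
                    + " defined=" + Character.isDefined(cp)
                    + " ideographic=" + Character.isIdeographic(cp)
                    + " script=" + Character.UnicodeScript.of(cp));
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("java.version=" + System.getProperty("java.version"));
        report().forEach(System.out::println);
    }
}
```

On an implementation that does not use the extension, one would expect each line to show `defined=false` and `script=UNKNOWN`, since that is how these methods treat unassigned code points.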
Specification
Modify the second paragraph in the Unicode Conformance section of the class description of java.lang.Character as follows:
diff a/src/java.base/share/classes/java/lang/Character.java b/src/java.base/share/classes/java/lang/Character.java
--- a/src/java.base/share/classes/java/lang/Character.java
+++ b/src/java.base/share/classes/java/lang/Character.java
@@ -52,14 +52,18 @@
* assigned Unicode code point or character range. The file is available
* from the Unicode Consortium at
* <a href="http://www.unicode.org">http://www.unicode.org</a>.
* <p>
* The Java SE 11 Platform uses character information from version 10.0
- * of the Unicode Standard, with an extension. The Java SE 11 Platform allows
- * an implementation of class {@code Character} to use the Japanese Era
- * code point, {@code U+32FF}, from the first version of the Unicode Standard
- * after 10.0 that assigns the code point. Consequently, the behavior of
+ * of the Unicode Standard, with two extensions. First, the Java SE 11 Platform
+ * allows an implementation of class {@code Character} to use the code points
+ * in the range of {@code U+9FEB} to {@code U+9FEF} from the Unicode Standard
+ * version 11.0, in order for the class to allow the "Implementation Level 1"
+ * of the Chinese GB18030-2022 standard. Second, the Java SE 11 Platform
+ * allows an implementation of class {@code Character} to use the Japanese Era
+ * code point, {@code U+32FF}, from the Unicode Standard version 12.1.
+ * Consequently, the behavior of
* fields and methods of class {@code Character} may vary across
* implementations of the Java SE 11 Platform when processing the
* aforementioned code point ( outside of version 10.0 ), except for
* the following methods that define Java identifiers:
* {@link #isJavaIdentifierStart(int)}, {@link #isJavaIdentifierStart(char)},
- csr of JDK-8301401: Allow additional characters for GB18030-2022 support (Resolved)