Type: CSR
Resolution: Approved
Priority: P4
Fix Version/s: 11.0.0.1
Component/s: core-libs
Labels:
- jsr384-mr2

Subcomponent:
java.lang
Compatibility Kind:

behavioral
Compatibility Risk:
minimal
Compatibility Risk Description:

The risk is minimal because this CSR simply *allows* the additional code points and leaves the existing code points intact. The change also prohibits the new code points from being the start or part of a Java identifier, so that binary compatibility is preserved, as was done for the Japanese Era character addition.

Interface Kind:

Java API
Scope:
SE

Summary

Allow additional code points from beyond Unicode 10.0, on which Java SE 11 is based, in order to support GB18030-2022.

Problem

The China National Standard body (CESI) has recently published GB18030-2022, an updated version of the GB18030 standard that brings GB18030 in sync with Unicode version 11.0. Since Java SE 11 supports only the characters defined in Unicode 10.0, some characters defined in the new GB18030 standard cannot be represented.

Solution

Allow the code points that are required by the Implementation Level 1 definition in the GB18030-2022 standard. The additionally required code points are in the range U+9FEB to U+9FEF, 5 code points in total.
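
As a rough illustration of what "allowing" these code points means in practice, the following sketch walks the range U+9FEB to U+9FEF and reports how the running implementation of `Character` treats each one. The class name `Gb18030CodePoints` and its helper methods are hypothetical and exist only for this example. The reported values are expected to vary across implementations and releases, so no output is shown.

```java
// Minimal sketch: inspect the five code points this CSR allows.
// Gb18030CodePoints, count() and describe() are hypothetical names for this example.
final class Gb18030CodePoints {
    // Range stated in the Solution section: U+9FEB to U+9FEF.
    static final int FIRST = 0x9FEB;
    static final int LAST = 0x9FEF;

    // Number of code points in the inclusive range.
    static int count() {
        return LAST - FIRST + 1;
    }

    // Summarize how this runtime's Character class treats one code point.
    // Results depend on the implementation, so they are only printed, not assumed.
    static String describe(int codePoint) {
        return String.format(
                "U+%04X defined=%b identifierStart=%b identifierPart=%b",
                codePoint,
                Character.isDefined(codePoint),
                Character.isJavaIdentifierStart(codePoint),
                Character.isJavaIdentifierPart(codePoint));
    }

    public static void main(String[] args) {
        System.out.println("Code points in range: " + count());
        for (int cp = FIRST; cp <= LAST; cp++) {
            System.out.println(describe(cp));
        }
    }
}
```

On a Java SE 11 implementation that adopts this change, `Character.isDefined` may report these code points as defined, while the identifier checks are intended to keep their Unicode 10.0 behavior.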

Specification

Modify the second paragraph of the Unicode Conformance section in the class description of the java.lang.Character class as follows:

diff a/src/java.base/share/classes/java/lang/Character.java b/src/java.base/share/classes/java/lang/Character.java
--- a/src/java.base/share/classes/java/lang/Character.java
+++ b/src/java.base/share/classes/java/lang/Character.java
@@ -52,14 +52,18 @@
  * assigned Unicode code point or character range. The file is available
  * from the Unicode Consortium at
  * <a href="http://www.unicode.org">http://www.unicode.org</a>.
  * <p>
  * The Java SE 11 Platform uses character information from version 10.0
- * of the Unicode Standard, with an extension. The Java SE 11 Platform allows
- * an implementation of class {@code Character} to use the Japanese Era
- * code point, {@code U+32FF}, from the first version of the Unicode Standard
- * after 10.0 that assigns the code point. Consequently, the behavior of
+ * of the Unicode Standard, with two extensions. First, the Java SE 11 Platform
+ * allows an implementation of class {@code Character} to use the code points
+ * in the range of {@code U+9FEB} to {@code U+9FEF} from the Unicode Standard
+ * version 11.0, in order for the class to allow the "Implementation Level 1"
+ * of the Chinese GB18030-2022 standard. Second, the Java SE 11 Platform
+ * allows an implementation of class {@code Character} to use the Japanese Era
+ * code point, {@code U+32FF}, from the Unicode Standard version 12.1.
+ * Consequently, the behavior of
  * fields and methods of class {@code Character} may vary across
  * implementations of the Java SE 11 Platform when processing the
  * aforementioned code point ( outside of version 10.0 ), except for
  * the following methods that define Java identifiers:
  * {@link #isJavaIdentifierStart(int)}, {@link #isJavaIdentifierStart(char)},

csr of

JDK-8301401 Allow additional characters for GB18030-2022 support

Resolved

Assignee: Naoto Sato

Reporter: Naoto Sato

Reviewed By: Alan Bateman, Lance Andersen

Votes: 0

Watchers: 7

Created: 2023-01-31 09:53

Updated: 2023-03-13 16:27

Resolved: 2023-02-06 13:34
