
Type: CSR
Resolution: Approved
Priority: P2
Fix Version/s: 12
Component/s: core-libs
Labels: None
Subcomponent: java.lang
Compatibility Kind: behavioral
Compatibility Risk: low
Compatibility Risk Description: Programs that rely on the behavior of Java SE implementations before 12, which generally do not recognize the new Japanese era code point, may behave differently on Java SE 12 implementations.
Interface Kind: Java API
Scope: SE

Summary

Mandate support in Java SE 12 for the new Japanese era code point.

Problem

The Java SE 12 Platform supports Unicode 11.0 (JDK-8212120) but must also support the new Japanese era code point, which Unicode 11.0 does not include. The Java SE 11 Platform was updated to support the new code point at the discretion of the implementation (JDK-8216594), but it is appropriate for the Java SE 12 Platform to mandate support for the new code point in all Java SE 12 implementations.

Solution

Mandate support for the new Japanese era code point in all Java SE 12 implementations. This will make every method in the Character class recognize the code point and behave consistently, which will improve the maintainability of all Java SE 12 implementations. (In contrast, the Java SE 11 Platform disallowed the code point in the isJavaIdentifierStart/Part methods, which made the implementations of those methods harder to maintain.)
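
As a rough illustration of "recognize the code point and behave consistently", the sketch below compares a handful of `Character` queries for U+32FF against an existing Japanese era character, U+337E. The class name `EraCodePointProperties` and the helper `sameProperties` are hypothetical names chosen for this example, and the set of queries compared is an arbitrary selection, not an exhaustive list of the affected methods.

```java
// Hypothetical helper class, for illustration only.
class EraCodePointProperties {
    static final int NEW_ERA = 0x32FF; // new Japanese era code point
    static final int MEIZI = 0x337E;   // existing era character, "Meizi"

    // Returns true if a selection of Character queries agree for both code points.
    static boolean sameProperties(int a, int b) {
        return Character.getType(a) == Character.getType(b)
            && Character.isDefined(a) == Character.isDefined(b)
            && Character.isLetter(a) == Character.isLetter(b)
            && Character.isDigit(a) == Character.isDigit(b)
            && Character.isJavaIdentifierStart(a) == Character.isJavaIdentifierStart(b)
            && Character.isJavaIdentifierPart(a) == Character.isJavaIdentifierPart(b);
    }

    public static void main(String[] args) {
        System.out.printf("U+%04X defined=%b type=%d%n",
                NEW_ERA, Character.isDefined(NEW_ERA), Character.getType(NEW_ERA));
        System.out.printf("U+%04X defined=%b type=%d%n",
                MEIZI, Character.isDefined(MEIZI), Character.getType(MEIZI));
        System.out.println("same properties: " + sameProperties(NEW_ERA, MEIZI));
    }
}
```

On a runtime that does not assign U+32FF, the two code points are expected to differ (for example in `isDefined`), so the last line of output depends on the implementation being run.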

Specification

Change the first half (before the "Unicode Character Representations" header) of the Character class specification from:

  * The {@code Character} class wraps a value of the primitive
  * type {@code char} in an object. An object of type
  * {@code Character} contains a single field whose type is
  * {@code char}.
  * <p>
  * In addition, this class provides several methods for determining
  * a character's category (lowercase letter, digit, etc.) and for converting
  * characters from uppercase to lowercase and vice versa.
  * <p>
  * Character information is based on the Unicode Standard, version 11.0.0.
  * <p>
  * The methods and data of class {@code Character} are defined by
  * the information in the <i>UnicodeData</i> file that is part of the
  * Unicode Character Database maintained by the Unicode
  * Consortium. This file specifies various properties including name
  * and general category for every defined Unicode code point or
  * character range.
  * <p>
  * The file and its description are available from the Unicode Consortium at:
  * <ul>
  * <li><a href="http://www.unicode.org">http://www.unicode.org</a>
  * </ul>
  * <p>
  * The code point, U+32FF, is reserved by the Unicode Consortium
  * to represent the Japanese square character for the new era that begins
  * May 2019. Relevant methods in the Character class return the same
  * properties as for the existing Japanese era characters (e.g., U+337E for
  * "Meizi"). For the details of the code point, refer to
  * <a href="http://blog.unicode.org/2018/09/new-japanese-era.html">
  * http://blog.unicode.org/2018/09/new-japanese-era.html</a>.

to:

 * The {@code Character} class wraps a value of the primitive
 * type {@code char} in an object. An object of class
 * {@code Character} contains a single field whose type is
 * {@code char}.
 * <p>
 * In addition, this class provides a large number of static methods for
 * determining a character's category (lowercase letter, digit, etc.)
 * and for converting characters from uppercase to lowercase and vice
 * versa.
 *
 * <h3><a id="conformance">Unicode Conformance</a></h3>
 * <p>
 * The fields and methods of class {@code Character} are defined in terms
 * of character information from the Unicode Standard, specifically the
 * <i>UnicodeData</i> file that is part of the Unicode Character Database.
 * This file specifies properties including name and category for every
 * assigned Unicode code point or character range. The file is available
 * from the Unicode Consortium at
 * <a href="http://www.unicode.org">http://www.unicode.org</a>.
 * <p> 
 * The Java SE 12 Platform uses character information from version 11.0
 * of the Unicode Standard, plus the Japanese Era code point,
 * {@code U+32FF}, from the first version of the Unicode Standard
 * after 11.0 that assigns the code point.
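
A program that wants to observe whether the runtime it is on assigns U+32FF could query `Character` directly rather than infer it from the platform version. The following is a minimal sketch; `EraCodePointReport`, `describe`, and `inEnclosedCjkBlock` are hypothetical names introduced for this example.

```java
// Hypothetical helper class, for illustration only.
class EraCodePointReport {
    static final int NEW_ERA = 0x32FF;

    // Summarizes what the running implementation reports for a code point.
    static String describe(int codePoint) {
        return String.format("U+%04X block=%s assigned=%b type=%d",
                codePoint,
                Character.UnicodeBlock.of(codePoint),
                Character.isDefined(codePoint),
                Character.getType(codePoint));
    }

    // U+32FF is the last code point of the Enclosed CJK Letters and Months block.
    static boolean inEnclosedCjkBlock(int codePoint) {
        return Character.UnicodeBlock.of(codePoint)
                == Character.UnicodeBlock.ENCLOSED_CJK_LETTERS_AND_MONTHS;
    }

    public static void main(String[] args) {
        System.out.println(describe(NEW_ERA));
    }
}
```

The `assigned` and `type` values in the output vary with the implementation; only the block membership is independent of whether the code point is assigned.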

Change the second paragraph of the isJavaIdentifierPart(char) and isJavaIdentifierPart(int) method descriptions from:

     * A character may be part of a Java identifier if any of the following
     * are true:

to:

     * A character may be part of a Java identifier if any of the following
     * conditions are true:

Change the last list item of the conditions in the isJavaIdentifierPart(int) method description from:

     * <li> {@link #isIdentifierIgnorable(int)
     * isIdentifierIgnorable(codePoint)} returns {@code true} for
     * the character

to:

     * <li> {@link #isIdentifierIgnorable(int)
     * isIdentifierIgnorable(codePoint)} returns {@code true} for
     * the code point

Change the second paragraph of the isJavaLetter(char) method description from:

     * A character may start a Java identifier if and only if
     * one of the following is true:

to:

     * A character may start a Java identifier if and only if
     * one of the following conditions is true:

Change the second paragraph of the isJavaLetterOrDigit(char) method description from:

     * A character may be part of a Java identifier if and only if any
     * of the following are true:

to:

     * A character may be part of a Java identifier if and only if one
     * of the following conditions is true:
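
For context on how the identifier methods touched above are typically used, here is a minimal sketch that walks a string by code point and applies `isJavaIdentifierStart(int)` and `isJavaIdentifierPart(int)`. The class `IdentifierCheck` and the method `isIdentifierShaped` are hypothetical names for this example; the check is deliberately incomplete as a language rule, since it does not reject keywords or literals such as `true` and `null`.

```java
// Hypothetical helper class, for illustration only.
class IdentifierCheck {
    // Returns true if every code point satisfies the Character identifier tests.
    // Assumption: keyword and literal exclusions are out of scope for this sketch.
    static boolean isIdentifierShaped(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Advance by code point so supplementary characters are handled.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int codePoint = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(codePoint)) {
                return false;
            }
            i += Character.charCount(codePoint);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifierShaped("era1"));
        System.out.println(isIdentifierShaped("1era"));
        System.out.println(isIdentifierShaped("era\u337E"));
    }
}
```

The existing era character U+337E is a symbol rather than a letter, so a string containing it is not identifier-shaped under these tests.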

csr of

JDK-8217893 Support new Japanese era in java.lang.Character for Java SE 11 (Resolved)

relates to

JDK-8212120 Unicode 11.0.0 (Closed)
JDK-8216594 Support new Japanese era in java.lang.Character (Closed)
JDK-8213055 Define Japanese new Era character (Closed)
JDK-8218915 Change isJavaIdentifierStart and isJavaIdentifierPart to handle new code points (Resolved)
JDK-8217939 Clarify support for the new Japanese era in java.time.chrono.JapaneseEra (Closed)
