Type: Enhancement
Resolution: Fixed
Priority: P3
Fix Version/s: openjdk8u212
Affects Version/s: 8
Component/s: core-libs
Labels:
Subcomponent: java.lang
Resolved In Build: b01
CPU: generic
OS: generic

Issue	Fix Version	Assignee	Priority	Status	Resolution	Resolved In Build
JDK-8218713	8u222	Sean Coffey	P3	Resolved	Fixed	master
JDK-8218015	8u221	Sean Coffey	P3	Resolved	Fixed	b01
JDK-8219757	8u212	Sean Coffey	P3	Resolved	Fixed	b06
JDK-8219735	8u211	Sean Coffey	P3	Resolved	Fixed	b08
JDK-8224364	emb-8u221	Sean Coffey	P3	Resolved	Fixed	master
JDK-8221041	emb-8u211	Sean Coffey	P3	Resolved	Fixed	b08

The Java SE 8 Platform uses character code points from version 6.2 of the Unicode Standard. As a result, the new Japanese Era code point (U+32FF), which is expected to be assigned in version 12.1 of the Unicode Standard, and the new currency code points assigned in version 10.0 of the Unicode Standard are not available for use in Java 8.

Solution
----------------
Modify the specification of java.lang.Character to allow (though not require) implementations of the Java SE 8 Platform to support the new era code point and currency code points. In effect, the Java SE 8 Platform supports Unicode 6.2 plus an extension.

Consequently, the behavior of fields and methods of java.lang.Character may vary across implementations of the Java SE 8 Platform when processing U+32FF and the currency code points U+20BB, U+20BC, U+20BD, U+20BE, and U+20BF, except for the following methods that define Java identifiers:

isJavaIdentifierStart(int)
isJavaIdentifierStart(char)
isJavaIdentifierPart(int)
isJavaIdentifierPart(char)

Code points in Java identifiers must continue to be drawn from Unicode 6.2, for source compatibility reasons.

These changes necessitate a Maintenance Review of the Java SE 8 Platform. See the announcement [0] to the OpenJDK community.

[0] http://mail.openjdk.java.net/pipermail/jdk8u-dev/2018-December/008324.html
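
The variation described above can be observed by probing the six code points on a given runtime. The following is a minimal sketch, not part of the change itself: the class name ExtensionProbe and the helper describe are hypothetical, and only long-standing java.lang.Character methods are called. No output is shown because the first two columns may legitimately differ between implementations of the Java SE 8 Platform.

```java
public class ExtensionProbe {
    // Code points named in this issue: the Japanese Era code point
    // followed by the five currency code points.
    static final int[] EXTENSION_CODE_POINTS = {
        0x32FF, 0x20BB, 0x20BC, 0x20BD, 0x20BE, 0x20BF
    };

    // Hypothetical helper: builds one report line for a code point.
    // isDefined and getType may vary across Java SE 8 implementations;
    // isJavaIdentifierStart is specified against Unicode 6.2 on Java SE 8.
    static String describe(int cp) {
        return String.format("U+%04X defined=%b type=%d identifierStart=%b",
                cp,
                Character.isDefined(cp),
                Character.getType(cp),
                Character.isJavaIdentifierStart(cp));
    }

    public static void main(String[] args) {
        for (int cp : EXTENSION_CODE_POINTS) {
            System.out.println(describe(cp));
        }
    }
}
```

A type value equal to Character.UNASSIGNED indicates that the runtime does not support the code point; a different value indicates that the optional extension (or a later Unicode version) is in effect.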

Specification
----------------
The initial portion of the specification of the java.lang.Character class is changed from:

/**
* The {@code Character} class wraps a value of the primitive
* type {@code char} in an object. An object of type
* {@code Character} contains a single field whose type is
* {@code char}.
* <p>
* In addition, this class provides several methods for determining
* a character's category (lowercase letter, digit, etc.) and for converting
* characters from uppercase to lowercase and vice versa.
* <p>
* Character information is based on the Unicode Standard, version 6.2.0.
* <p>
* The methods and data of class {@code Character} are defined by
* the information in the <i>UnicodeData</i> file that is part of the
* Unicode Character Database maintained by the Unicode
* Consortium. This file specifies various properties including name
* and general category for every defined Unicode code point or
* character range.
* <p>
* The file and its description are available from the Unicode Consortium at:
* <ul>
* <li><a href="http://www.unicode.org">http://www.unicode.org</a>
* </ul>
*
* <h3><a name="unicode">Unicode Character Representations</a></h3>

to

/**
* The {@code Character} class wraps a value of the primitive
* type {@code char} in an object. An object of class
* {@code Character} contains a single field whose type is
* {@code char}.
* <p>
* In addition, this class provides a large number of static methods for
* determining a character's category (lowercase letter, digit, etc.)
* and for converting characters from uppercase to lowercase and vice
* versa.
*
* <h3><a id="conformance">Unicode Conformance</a></h3>
* <p>
* The fields and methods of class {@code Character} are defined in terms
* of character information from the Unicode Standard, specifically the
* <i>UnicodeData</i> file that is part of the Unicode Character Database.
* This file specifies properties including name and category for every
* assigned Unicode code point or character range. The file is available
* from the Unicode Consortium at
* <a href="http://www.unicode.org">http://www.unicode.org</a>.
* <p>
* The Java SE 8 Platform uses character information from version 6.2
* of the Unicode Standard, with two extensions. First, the Java SE 8 Platform
* allows an implementation of class {@code Character} to use the Japanese Era
* code point, {@code U+32FF}, from the first version of the Unicode Standard
* after 6.2 that assigns the code point. Second, in recognition of the fact
* that new currencies appear frequently, the Java SE 8 Platform allows an
* implementation of class {@code Character} to use the Currency Symbols
* block from version 10.0 of the Unicode Standard. Consequently, the
* behavior of fields and methods of class {@code Character} may vary across
* implementations of the Java SE 8 Platform when processing the aforementioned
* code points ( outside of version 6.2 ), except for the following methods
* that define Java identifiers:
* {@link #isJavaIdentifierStart(int)}, {@link #isJavaIdentifierStart(char)},
* {@link #isJavaIdentifierPart(int)}, and {@link #isJavaIdentifierPart(char)}.
* Code points in Java identifiers must be drawn from version 6.2 of
* the Unicode Standard.
*
* <h3><a name="unicode">Unicode Character Representations</a></h3>

The initial portion of the specification of the isJavaLetter(char ch) method is changed from:

/**
* Determines if the specified character is permissible as the first
* character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following is true:
* <ul>
* <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
* <li> {@link #getType(char) getType(ch)} returns {@code LETTER_NUMBER}
* <li> {@code ch} is a currency symbol (such as {@code '$'})
* <li> {@code ch} is a connecting punctuation character (such as {@code '_'}).
* </ul>
*

to

/**
* Determines if the specified character is permissible as the first
* character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following conditions is true:
* <ul>
* <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
* <li> {@link #getType(char) getType(ch)} returns {@code LETTER_NUMBER}
* <li> {@code ch} is a currency symbol (such as {@code '$'})
* <li> {@code ch} is a connecting punctuation character (such as {@code '_'}).
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*

The initial portion of the specification of the isJavaLetterOrDigit(char ch) method is changed from:

/**
* Determines if the specified character may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if and only if any
* of the following are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@code isIdentifierIgnorable} returns
* {@code true} for the character.
* </ul>
*

to

/**
* Determines if the specified character may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if and only if any
* of the following conditions are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@code isIdentifierIgnorable} returns
* {@code true} for the character.
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*

The initial portion of the specification of the isJavaIdentifierStart(char ch) method is changed from:

/**
* Determines if the specified character is
* permissible as the first character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following conditions is true:
* <ul>
* <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
* <li> {@link #getType(char) getType(ch)} returns {@code LETTER_NUMBER}
* <li> {@code ch} is a currency symbol (such as {@code '$'})
* <li> {@code ch} is a connecting punctuation character (such as {@code '_'}).
* </ul>
*

to

/**
* Determines if the specified character is
* permissible as the first character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following conditions is true:
* <ul>
* <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
* <li> {@link #getType(char) getType(ch)} returns {@code LETTER_NUMBER}
* <li> {@code ch} is a currency symbol (such as {@code '$'})
* <li> {@code ch} is a connecting punctuation character (such as {@code '_'}).
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*

The initial portion of the specification of the isJavaIdentifierStart(int codePoint) method is changed from:

/**
* Determines if the character (Unicode code point) is
* permissible as the first character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following conditions is true:
* <ul>
* <li> {@link #isLetter(int) isLetter(codePoint)}
*      returns {@code true}
* <li> {@link #getType(int) getType(codePoint)}
*      returns {@code LETTER_NUMBER}
* <li> the referenced character is a currency symbol (such as {@code '$'})
* <li> the referenced character is a connecting punctuation character
*      (such as {@code '_'}).
* </ul>
*

to

/**
* Determines if the character (Unicode code point) is
* permissible as the first character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following conditions is true:
* <ul>
* <li> {@link #isLetter(int) isLetter(codePoint)}
*      returns {@code true}
* <li> {@link #getType(int) getType(codePoint)}
*      returns {@code LETTER_NUMBER}
* <li> the referenced character is a currency symbol (such as {@code '$'})
* <li> the referenced character is a connecting punctuation character
*      (such as {@code '_'}).
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*

The initial portion of the specification of the isJavaIdentifierPart(char ch) method is changed from:

/**
* Determines if the specified character may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if any of the following
* are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@code isIdentifierIgnorable} returns
* {@code true} for the character
* </ul>
*

to

/**
* Determines if the specified character may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if any of the following
* conditions are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@code isIdentifierIgnorable} returns
* {@code true} for the character
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*

The initial portion of the specification of the isJavaIdentifierPart(int codePoint) method is changed from:

/**
* Determines if the character (Unicode code point) may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if any of the following
* are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@link #isIdentifierIgnorable(int)
* isIdentifierIgnorable(codePoint)} returns {@code true} for
* the character
* </ul>
*

to

/**
* Determines if the character (Unicode code point) may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if any of the following
* conditions are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@link #isIdentifierIgnorable(int)
* isIdentifierIgnorable(codePoint)} returns {@code true} for
* the code point
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*
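
To illustrate why the identifier methods are pinned to Unicode 6.2, here is a minimal sketch of an identifier check built on the int overloads specified above. The class name IdentifierCheck and the method isIdentifier are hypothetical, and the sketch deliberately does not reject keywords or literals such as "class" or "true". Because the check delegates entirely to isJavaIdentifierStart(int) and isJavaIdentifierPart(int), its answers on Java SE 8 should not depend on whether an implementation adopts the optional extension.

```java
public class IdentifierCheck {
    // Hypothetical helper: returns true if every code point of s is
    // acceptable in a Java identifier at its position. Keywords and
    // reserved literals are not rejected by this sketch.
    static boolean isIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Step by code point, not by char, so that supplementary
        // characters (surrogate pairs) are tested as a whole.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("total$"));   // '$' is a currency symbol
        System.out.println(isIdentifier("2fast"));    // starts with a digit
        // U+20BF is one of the currency code points from this issue; the
        // result depends on the Unicode version the platform uses for
        // identifiers, so no expected value is given here.
        System.out.println(isIdentifier("\u20BFprice"));
    }
}
```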

backported by

JDK-8218015 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
JDK-8218713 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
JDK-8219735 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
JDK-8219757 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
JDK-8221041 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
JDK-8224364 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)

is cloned by

JDK-8216546 Support new Japanese era in java.lang.Character for Java SE 11 (Resolved)

relates to

JDK-8211398 Square character support for the Japanese new era (Resolved)
JDK-8217710 Add 5 currency code points to Java SE 8uX (Resolved)
JDK-8215303 Allowing additional currency code points from later Unicode updates (Closed)
