Enhancement
Resolution: Fixed
P3
8
b01
generic
generic
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8218713 | 8u222 | Sean Coffey | P3 | Resolved | Fixed | master |
JDK-8218015 | 8u221 | Sean Coffey | P3 | Resolved | Fixed | b01 |
JDK-8219757 | 8u212 | Sean Coffey | P3 | Resolved | Fixed | b06 |
JDK-8219735 | 8u211 | Sean Coffey | P3 | Resolved | Fixed | b08 |
JDK-8224364 | emb-8u221 | Sean Coffey | P3 | Resolved | Fixed | master |
JDK-8221041 | emb-8u211 | Sean Coffey | P3 | Resolved | Fixed | b08 |
Solution
----------------
Modify the specification of java.lang.Character to allow (though not require) implementations of the Java SE 8 Platform to support the new Japanese era code point and the new currency code points. In effect, the Java SE 8 Platform supports Unicode 6.2 plus an extension.
Consequently, the behavior of fields and methods of java.lang.Character may vary across implementations of the Java SE 8 Platform when processing the era code point U+32FF and the currency code points U+20BB, U+20BC, U+20BD, U+20BE, and U+20BF, except for the following methods that define Java identifiers:
isJavaIdentifierStart(int)
isJavaIdentifierStart(char)
isJavaIdentifierPart(int)
isJavaIdentifierPart(char)
Code points in Java identifiers must continue to be drawn from Unicode 6.2, for source compatibility reasons.
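Because support for these code points is optional, a program that depends on them may want to check what the running implementation actually does. The following is a minimal sketch, not part of the specification change; the class name `ExtensionProbe` and its helper methods are hypothetical, and the sketch assumes that an unsupported code point is reported by `Character.getType(int)` as `UNASSIGNED`.

```java
/** Probes how the running implementation classifies the extension code points. */
final class ExtensionProbe {

    // The Japanese era code point followed by the five currency code points.
    static final int[] EXTENSION_CODE_POINTS = {
        0x32FF, 0x20BB, 0x20BC, 0x20BD, 0x20BE, 0x20BF
    };

    /** Returns true if this implementation treats the code point as assigned. */
    static boolean isRecognized(int codePoint) {
        return Character.getType(codePoint) != Character.UNASSIGNED;
    }

    /** Formats one report line; the boolean values depend on the implementation. */
    static String report(int codePoint) {
        return String.format(
                "U+%04X recognized=%b identifierStart=%b identifierPart=%b",
                codePoint,
                isRecognized(codePoint),
                Character.isJavaIdentifierStart(codePoint),
                Character.isJavaIdentifierPart(codePoint));
    }

    public static void main(String[] args) {
        for (int cp : EXTENSION_CODE_POINTS) {
            System.out.println(report(cp));
        }
    }
}
```

On a Java SE 8 implementation that adopts the extension, the `recognized` column may be true while both identifier columns stay false, since the identifier methods remain tied to Unicode 6.2. Later Java SE releases are based on newer Unicode versions and may report different values.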
These changes necessitate a Maintenance Review of the Java SE 8 Platform. See the announcement [0] to the OpenJDK community.
[0] http://mail.openjdk.java.net/pipermail/jdk8u-dev/2018-December/008324.html
Specification
----------------
The initial portion of the specification of the java.lang.Character class is changed from:
/**
* The {@code Character} class wraps a value of the primitive
* type {@code char} in an object. An object of type
* {@code Character} contains a single field whose type is
* {@code char}.
* <p>
* In addition, this class provides several methods for determining
* a character's category (lowercase letter, digit, etc.) and for converting
* characters from uppercase to lowercase and vice versa.
* <p>
* Character information is based on the Unicode Standard, version 6.2.0.
* <p>
* The methods and data of class {@code Character} are defined by
* the information in the <i>UnicodeData</i> file that is part of the
* Unicode Character Database maintained by the Unicode
* Consortium. This file specifies various properties including name
* and general category for every defined Unicode code point or
* character range.
* <p>
* The file and its description are available from the Unicode Consortium at:
* <ul>
* <li><a href="http://www.unicode.org">http://www.unicode.org</a>
* </ul>
*
* <h3><a name="unicode">Unicode Character Representations</a></h3>
to
/**
* The {@code Character} class wraps a value of the primitive
* type {@code char} in an object. An object of class
* {@code Character} contains a single field whose type is
* {@code char}.
* <p>
* In addition, this class provides a large number of static methods for
* determining a character's category (lowercase letter, digit, etc.)
* and for converting characters from uppercase to lowercase and vice
* versa.
*
* <h3><a id="conformance">Unicode Conformance</a></h3>
* <p>
* The fields and methods of class {@code Character} are defined in terms
* of character information from the Unicode Standard, specifically the
* <i>UnicodeData</i> file that is part of the Unicode Character Database.
* This file specifies properties including name and category for every
* assigned Unicode code point or character range. The file is available
* from the Unicode Consortium at
* <a href="http://www.unicode.org">http://www.unicode.org</a>.
* <p>
* The Java SE 8 Platform uses character information from version 6.2
* of the Unicode Standard, with two extensions. First, the Java SE 8 Platform
* allows an implementation of class {@code Character} to use the Japanese Era
* code point, {@code U+32FF}, from the first version of the Unicode Standard
* after 6.2 that assigns the code point. Second, in recognition of the fact
* that new currencies appear frequently, the Java SE 8 Platform allows an
* implementation of class {@code Character} to use the Currency Symbols
* block from version 10.0 of the Unicode Standard. Consequently, the
* behavior of fields and methods of class {@code Character} may vary across
* implementations of the Java SE 8 Platform when processing the aforementioned
* code points ( outside of version 6.2 ), except for the following methods
* that define Java identifiers:
* {@link #isJavaIdentifierStart(int)}, {@link #isJavaIdentifierStart(char)},
* {@link #isJavaIdentifierPart(int)}, and {@link #isJavaIdentifierPart(char)}.
* Code points in Java identifiers must be drawn from version 6.2 of
* the Unicode Standard.
*
* <h3><a name="unicode">Unicode Character Representations</a></h3>
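To illustrate the distinction the revised text draws, the sketch below separates two questions about a code point: which Unicode block it falls in, and whether the running implementation classifies it as a currency symbol. It assumes that `Character.UnicodeBlock.of(int)` resolves blocks by range, so the block answer is expected to be the same whether or not the code point is assigned, while the category answer may vary across implementations. The class name `BlockVersusCategory` is hypothetical.

```java
/** Contrasts block membership with general category for a code point. */
final class BlockVersusCategory {

    /** Returns the Unicode block containing the code point, or null if none. */
    static Character.UnicodeBlock blockOf(int codePoint) {
        return Character.UnicodeBlock.of(codePoint);
    }

    /** True only if this implementation classifies the code point as a currency symbol. */
    static boolean isCurrencySymbol(int codePoint) {
        return Character.getType(codePoint) == Character.CURRENCY_SYMBOL;
    }

    public static void main(String[] args) {
        // The five currency code points permitted by the extension.
        for (int cp = 0x20BB; cp <= 0x20BF; cp++) {
            System.out.printf("U+%04X block=%s currencySymbol=%b%n",
                    cp, blockOf(cp), isCurrencySymbol(cp));
        }
        // The Japanese era code point.
        System.out.printf("U+%04X block=%s assigned=%b%n",
                0x32FF, blockOf(0x32FF), Character.isDefined(0x32FF));
    }
}
```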
The initial portion of the specification of the isJavaLetter(char ch) method is changed from:
/**
* Determines if the specified character is permissible as the first
* character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following is true:
* <ul>
* <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
* <li> {@link #getType(char) getType(ch)} returns {@code LETTER_NUMBER}
* <li> {@code ch} is a currency symbol (such as {@code '$'})
* <li> {@code ch} is a connecting punctuation character (such as {@code '_'}).
* </ul>
*
to
/**
* Determines if the specified character is permissible as the first
* character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following conditions is true:
* <ul>
* <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
* <li> {@link #getType(char) getType(ch)} returns {@code LETTER_NUMBER}
* <li> {@code ch} is a currency symbol (such as {@code '$'})
* <li> {@code ch} is a connecting punctuation character (such as {@code '_'}).
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*
The initial portion of the specification of the isJavaLetterOrDigit(char ch) method is changed from:
/**
* Determines if the specified character may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if and only if any
* of the following are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@code isIdentifierIgnorable} returns
* {@code true} for the character.
* </ul>
*
to
/**
* Determines if the specified character may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if and only if any
* of the following conditions are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@code isIdentifierIgnorable} returns
* {@code true} for the character.
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*
The initial portion of the specification of the isJavaIdentifierStart(char ch) method is changed from:
/**
* Determines if the specified character is
* permissible as the first character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following conditions is true:
* <ul>
* <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
* <li> {@link #getType(char) getType(ch)} returns {@code LETTER_NUMBER}
* <li> {@code ch} is a currency symbol (such as {@code '$'})
* <li> {@code ch} is a connecting punctuation character (such as {@code '_'}).
* </ul>
*
to
/**
* Determines if the specified character is
* permissible as the first character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following conditions is true:
* <ul>
* <li> {@link #isLetter(char) isLetter(ch)} returns {@code true}
* <li> {@link #getType(char) getType(ch)} returns {@code LETTER_NUMBER}
* <li> {@code ch} is a currency symbol (such as {@code '$'})
* <li> {@code ch} is a connecting punctuation character (such as {@code '_'}).
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*
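The added sentence matters because a caller who re-implements the four listed conditions with `isLetter` and `getType` evaluates them against whatever character data the runtime uses, which on Java SE 8 may include the extension. The sketch below, with the hypothetical class name `IdentifierStartCheck`, applies the conditions by hand and compares the result with the library method. On an implementation that adopts the extension, the two may disagree for U+20BB through U+20BF; that outcome is an expectation drawn from the text, not a measured result.

```java
/** Compares a hand-written version of the documented conditions with the library. */
final class IdentifierStartCheck {

    /** Applies the four documented conditions using this runtime's character data. */
    static boolean satisfiesDocumentedConditions(char ch) {
        int type = Character.getType(ch);
        return Character.isLetter(ch)
                || type == Character.LETTER_NUMBER
                || type == Character.CURRENCY_SYMBOL
                || type == Character.CONNECTOR_PUNCTUATION;
    }

    /** True when the hand-written check and the library method disagree. */
    static boolean disagrees(char ch) {
        return satisfiesDocumentedConditions(ch) != Character.isJavaIdentifierStart(ch);
    }

    public static void main(String[] args) {
        // The five currency code points all lie in the Basic Multilingual Plane.
        for (int cp = 0x20BB; cp <= 0x20BF; cp++) {
            char ch = (char) cp;
            System.out.printf("U+%04X conditions=%b library=%b%n",
                    cp,
                    satisfiesDocumentedConditions(ch),
                    Character.isJavaIdentifierStart(ch));
        }
    }
}
```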
The initial portion of the specification of the isJavaIdentifierStart(int codePoint) method is changed from:
/**
* Determines if the character (Unicode code point) is
* permissible as the first character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following conditions is true:
* <ul>
* <li> {@link #isLetter(int) isLetter(codePoint)}
* returns {@code true}
* <li> {@link #getType(int) getType(codePoint)}
* returns {@code LETTER_NUMBER}
* <li> the referenced character is a currency symbol (such as {@code '$'})
* <li> the referenced character is a connecting punctuation character
* (such as {@code '_'}).
* </ul>
*
to
/**
* Determines if the character (Unicode code point) is
* permissible as the first character in a Java identifier.
* <p>
* A character may start a Java identifier if and only if
* one of the following conditions is true:
* <ul>
* <li> {@link #isLetter(int) isLetter(codePoint)}
* returns {@code true}
* <li> {@link #getType(int) getType(codePoint)}
* returns {@code LETTER_NUMBER}
* <li> the referenced character is a currency symbol (such as {@code '$'})
* <li> the referenced character is a connecting punctuation character
* (such as {@code '_'}).
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*
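The `int` overloads exist so that supplementary characters can be tested as whole code points rather than as surrogate halves. As a usage sketch under that assumption, the hypothetical helper `looksLikeIdentifier` below walks a string by code point and applies `isJavaIdentifierStart(int)` to the first code point and `isJavaIdentifierPart(int)` to the rest. It checks identifier syntax only and does not exclude keywords or literals such as `true` and `null`.

```java
/** Scans a string by code point using the identifier methods of Character. */
final class IdentifierScanner {

    /** Checks identifier syntax only; keywords and literals are not excluded. */
    static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Advance by one or two chars depending on the width of each code point.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = { "total$1", "1abc", "a b", "_tmp" };
        for (String sample : samples) {
            System.out.println(sample + " -> " + looksLikeIdentifier(sample));
        }
    }
}
```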
The initial portion of the specification of the isJavaIdentifierPart(char ch) method is changed from:
/**
* Determines if the specified character may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if any of the following
* are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@code isIdentifierIgnorable} returns
* {@code true} for the character
* </ul>
*
to
/**
* Determines if the specified character may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if any of the following
* conditions are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@code isIdentifierIgnorable} returns
* {@code true} for the character
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*
The initial portion of the specification of the isJavaIdentifierPart(int codePoint) method is changed from:
/**
* Determines if the character (Unicode code point) may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if any of the following
* are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@link #isIdentifierIgnorable(int)
* isIdentifierIgnorable(codePoint)} returns {@code true} for
* the character
* </ul>
*
to
/**
* Determines if the character (Unicode code point) may be part of a Java
* identifier as other than the first character.
* <p>
* A character may be part of a Java identifier if any of the following
* conditions are true:
* <ul>
* <li> it is a letter
* <li> it is a currency symbol (such as {@code '$'})
* <li> it is a connecting punctuation character (such as {@code '_'})
* <li> it is a digit
* <li> it is a numeric letter (such as a Roman numeral character)
* <li> it is a combining mark
* <li> it is a non-spacing mark
* <li> {@link #isIdentifierIgnorable(int)
* isIdentifierIgnorable(codePoint)} returns {@code true} for
* the code point
* </ul>
*
* These conditions are tested against the character information from version
* 6.2 of the Unicode Standard.
*
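One way to observe the effect of the added sentence is to evaluate the eight listed conditions against the runtime's own character data and collect the code points where the result differs from `isJavaIdentifierPart(int)`. The sketch below does this for a caller-supplied range. The class name `IdentifierPartCheck` and its methods are hypothetical, and the mapping of "digit" to `DECIMAL_DIGIT_NUMBER` and "combining mark" to `COMBINING_SPACING_MARK` is an assumption about how the prose corresponds to general categories.

```java
import java.util.ArrayList;
import java.util.List;

/** Finds code points where hand-applied conditions differ from the library. */
final class IdentifierPartCheck {

    /** Applies the eight documented conditions using this runtime's character data. */
    static boolean satisfiesDocumentedConditions(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.CURRENCY_SYMBOL:
            case Character.CONNECTOR_PUNCTUATION:
            case Character.DECIMAL_DIGIT_NUMBER:
            case Character.LETTER_NUMBER:
            case Character.COMBINING_SPACING_MARK:
            case Character.NON_SPACING_MARK:
                return true;
            default:
                return Character.isLetter(codePoint)
                        || Character.isIdentifierIgnorable(codePoint);
        }
    }

    /** Lists code points in [from, to] where the two checks disagree. */
    static List<Integer> disagreements(int from, int to) {
        List<Integer> result = new ArrayList<>();
        for (int cp = from; cp <= to; cp++) {
            if (satisfiesDocumentedConditions(cp) != Character.isJavaIdentifierPart(cp)) {
                result.add(cp);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Scan the code points that make up the Currency Symbols block range.
        for (int cp : disagreements(0x20A0, 0x20CF)) {
            System.out.printf("U+%04X differs%n", cp);
        }
    }
}
```

An empty result for the Currency Symbols range suggests that the runtime's character data and its identifier tables agree there. A non-empty result on a Java SE 8 implementation would be consistent with the extension being present while identifiers remain tied to Unicode 6.2.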
- backported by
  - JDK-8218015 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
  - JDK-8218713 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
  - JDK-8219735 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
  - JDK-8219757 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
  - JDK-8221041 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
  - JDK-8224364 Support new Japanese era and new currency code points in java.lang.Character for Java SE 8 (Resolved)
- is cloned by
  - JDK-8216546 Support new Japanese era in java.lang.Character for Java SE 11 (Resolved)
- relates to
  - JDK-8211398 Square character support for the Japanese new era (Resolved)
  - JDK-8217710 Add 5 currency code points to Java SE 8uX (Resolved)
  - JDK-8215303 Allowing additional currency code points from later Unicode updates (Closed)