Requested by people at Unicode. In particular, the conditions under which `java.lang.Character.isUnicodeIdentifierStart/Part()` return `true` differ from UAX 31 in two ways:
1) UAX 31 removes some characters with
-\p{Pattern_Syntax}-\p{Pattern_White_Space}
2) the JDK adds characters for which `isIdentifierIgnorable` returns `true`
ICU, the reference implementation of the Unicode specs, has recently started following the UAX strictly, which means removing some characters from its previous definition/implementation. In the JDK, the spec was revised to the latest standard (at that point) with https://bugs.openjdk.org/browse/JDK-8229831, while preserving backward compatibility. We might consider aligning our spec/impl strictly with UAX 31, matching ICU as the Unicode RI.
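The `isIdentifierIgnorable` difference noted above can be observed with a small probe. This is only an illustrative sketch: the class name `IdentifierIgnorableDemo`, the helper `acceptedAsIgnorable`, and the sample code points are made up for this example, and it does not attempt to check the `Pattern_Syntax`/`Pattern_White_Space` exclusions.

```java
public class IdentifierIgnorableDemo {

    // Hypothetical helper: true when the JDK accepts the code point as a
    // Unicode identifier part and the code point is also identifier-ignorable.
    static boolean acceptedAsIgnorable(int codePoint) {
        return Character.isUnicodeIdentifierPart(codePoint)
                && Character.isIdentifierIgnorable(codePoint);
    }

    public static void main(String[] args) {
        // Sample code points chosen for illustration only:
        // U+0000 (ISO control), U+200B (format character), 'a', '_'.
        int[] samples = {0x0000, 0x200B, 'a', '_'};
        for (int cp : samples) {
            System.out.printf("U+%04X part=%b ignorable=%b%n",
                    cp,
                    Character.isUnicodeIdentifierPart(cp),
                    Character.isIdentifierIgnorable(cp));
        }
    }
}
```

Code points reported with both `part` and `ignorable` set are candidates for a behavior change if the JDK were aligned strictly with UAX 31; whether each one is actually affected would need to be checked against the UAX 31 property data.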
- relates to: JDK-8229831 Upgrade Character.isUnicodeIdentifierStart/Part() methods to the latest standard (Resolved)