JDK-8304162

Unicode ID differs from UAX 31


    • Type: Enhancement
    • Resolution: Not an Issue
    • Priority: P4
    • None
    • None
    • core-libs
    • None
    • generic
    • generic

      Requested by people at Unicode. In particular, the set of characters for which `java.lang.Character.isUnicodeIdentifierStart()`/`isUnicodeIdentifierPart()` return `true` differs from UAX 31 in two ways:

      1) UAX 31 removes some characters via
      -\p{Pattern_Syntax}-\p{Pattern_White_Space}

      2) The JDK adds the characters for which `isIdentifierIgnorable()` returns `true`
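
      A minimal sketch of point 2, assuming the behavior documented for `java.lang.Character`: U+0000 is identifier-ignorable, so `isUnicodeIdentifierPart()` accepts it even though it is a control character. The class name `IdentifierIgnorableDemo` and the helper `countIgnorableParts()` are hypothetical names used only for illustration. Point 1 (the Pattern_Syntax/Pattern_White_Space subtraction) is not covered here.

```java
// Hypothetical demo class; not part of the JDK or of this issue.
class IdentifierIgnorableDemo {

    // Counts code points that are identifier-ignorable and are also
    // accepted by Character.isUnicodeIdentifierPart().
    static int countIgnorableParts() {
        int count = 0;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (Character.isIdentifierIgnorable(cp) && Character.isUnicodeIdentifierPart(cp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int nul = 0x0000; // ISO control character, identifier-ignorable per the Javadoc

        System.out.println("isIdentifierIgnorable(U+0000)    = " + Character.isIdentifierIgnorable(nul));
        System.out.println("isUnicodeIdentifierPart(U+0000)  = " + Character.isUnicodeIdentifierPart(nul));
        System.out.println("isUnicodeIdentifierStart(U+0000) = " + Character.isUnicodeIdentifierStart(nul));

        // The exact count depends on the Unicode version of the running JDK.
        System.out.println("ignorable code points accepted as part: " + countIgnorableParts());
    }
}
```

      The count printed on the last line varies with the Unicode version supported by the running JDK, so no fixed value is given here.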

      ICU, which is the reference implementation of the Unicode specs, recently began following the UAX strictly, which meant removing some characters from its previous definition/implementation. In the JDK, the spec was revised to the latest version (at that point) with https://bugs.openjdk.org/browse/JDK-8229831, while preserving backward compatibility. We might consider aligning our spec/impl strictly with UAX 31, and thereby with ICU as the Unicode RI.

            Assignee: Naoto Sato (naoto)
            Reporter: Naoto Sato (naoto)