JDK-8230188

Upgrade Character.isUnicodeIdentifierStart/Part() methods to the latest standard


    • Type: CSR
    • Resolution: Approved
    • Priority: P4
    • 14
    • core-libs
    • None
    • behavioral
    • low
    • Other_ID_{Start|Continue} characters become permissible as Unicode Identifier characters.
    • Java API
    • SE

      Summary

      Upgrade the specification and implementation of java.lang.Character#isUnicodeIdentifier{Start|Part}({char|int}) methods to the latest Unicode Standard.

      Problem

      The current specification does not state clearly what the methods are for, nor which level of the Unicode Standard the implementation conforms to.

      Solution

      Revise the specification to conform to requirement R1 (Default Identifiers) of Unicode Standard Annex #31, at the Unicode 12.0 level. This change does not affect the isJavaIdentifier{Start|Part}({char|int}) methods in any way.
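
      The two method families are independent, which a small comparison can illustrate. The following is a minimal sketch, not part of the CSR; the class name IdentifierFamilies and the helper classify are hypothetical names chosen for illustration.

```java
public class IdentifierFamilies {

    /**
     * Reports how the Java identifier and the Unicode identifier
     * "start" methods classify a single code point.
     * (Hypothetical helper, for illustration only.)
     */
    static String classify(int codePoint) {
        return String.format("U+%04X java=%b unicode=%b",
                codePoint,
                Character.isJavaIdentifierStart(codePoint),
                Character.isUnicodeIdentifierStart(codePoint));
    }

    public static void main(String[] args) {
        // '$' is a currency symbol: a valid Java identifier start,
        // but not a Unicode identifier start.
        System.out.println(classify('$'));
        // '_' is connector punctuation: same split as '$'.
        System.out.println(classify('_'));
        // A plain letter is accepted by both families.
        System.out.println(classify('A'));
    }
}
```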

      Specification

      Append the following item to the list of conditions for permissible characters, in the method descriptions of isUnicodeIdentifierStart({char|int}):

       * <li> it is an <a href="http://www.unicode.org/reports/tr44/#Other_ID_Start">
       *      {@code Other_ID_Start}</a> character.

      Add the following paragraph after the list of conditions for permissible characters, in the method descriptions of isUnicodeIdentifierStart({char|int}):

       * This method conforms to <a href="https://unicode.org/reports/tr31/#R1">
       * UAX31-R1: Default Identifiers</a> requirement of the Unicode Standard,
       * with the following profile of UAX31:
       * <pre>
       * Start := ID_Start + 'VERTICAL TILDE' (U+2E2F)
       * </pre>
       * {@code 'VERTICAL TILDE'} is added to {@code Start} for backward
       * compatibility.
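
      The Start profile above can be probed directly. This is a minimal sketch that assumes a runtime of JDK 14 or later, where this change is in effect; earlier releases may report false for the Other_ID_Start code point. The class name StartProfileDemo and its constants are hypothetical names for illustration.

```java
public class StartProfileDemo {

    // SCRIPT CAPITAL P: an Other_ID_Start character (general category Sm).
    static final int SCRIPT_CAPITAL_P = 0x2118;

    // VERTICAL TILDE: added to Start by the profile for backward compatibility.
    static final int VERTICAL_TILDE = 0x2E2F;

    /** Thin wrapper so the checks below read like the profile definition. */
    static boolean isStart(int codePoint) {
        return Character.isUnicodeIdentifierStart(codePoint);
    }

    public static void main(String[] args) {
        // Assumption: JDK 14 or later. Both are expected to be accepted there.
        System.out.println("U+2118 start? " + isStart(SCRIPT_CAPITAL_P));
        System.out.println("U+2E2F start? " + isStart(VERTICAL_TILDE));
        // A digit may continue an identifier but never starts one.
        System.out.println("'1' start?    " + isStart('1'));
    }
}
```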

      Append the following items to the list of conditions for permissible characters, in the method descriptions of isUnicodeIdentifierPart({char|int}):

       * <li> it is an <a href="http://www.unicode.org/reports/tr44/#Other_ID_Start">
       *      {@code Other_ID_Start}</a> character.
       * <li> it is an <a href="http://www.unicode.org/reports/tr44/#Other_ID_Continue">
       *      {@code Other_ID_Continue}</a> character.

      Add the following paragraph after the list of conditions for permissible characters, in the method descriptions of isUnicodeIdentifierPart({char|int}). In the int version, the char argument to isIdentifierIgnorable is replaced with int:

       * This method conforms to <a href="https://unicode.org/reports/tr31/#R1">
       * UAX31-R1: Default Identifiers</a> requirement of the Unicode Standard,
       * with the following profile of UAX31:
       * <pre>
       * Continue := Start + ID_Continue + ignorable
       * Medial := empty
       * ignorable := isIdentifierIgnorable(char) returns true for the character
       * </pre>
       * {@code ignorable} is added to {@code Continue} for backward
       * compatibility.
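
      Taken together, the Start and Continue profiles describe an identifier as one Start character followed by zero or more Continue characters, with no Medial characters. The following is a minimal sketch of such a check, assuming JDK 14 or later; the class UnicodeIdentifierCheck and the method isUnicodeIdentifier are hypothetical names, not part of the Java API.

```java
public class UnicodeIdentifierCheck {

    /**
     * Returns true if the string is one Start code point followed by
     * zero or more Continue code points, using the Character methods
     * described in this CSR. Iterates by code point so that
     * supplementary characters are handled correctly.
     */
    static boolean isUnicodeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isUnicodeIdentifierStart(first)) {
            return false;
        }
        int i = Character.charCount(first);
        while (i < s.length()) {
            int codePoint = s.codePointAt(i);
            if (!Character.isUnicodeIdentifierPart(codePoint)) {
                return false;
            }
            i += Character.charCount(codePoint);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isUnicodeIdentifier("abc1"));      // letters, then a digit
        System.out.println(isUnicodeIdentifier("1abc"));      // starts with a digit
        // MIDDLE DOT (U+00B7) is an Other_ID_Continue character.
        System.out.println(isUnicodeIdentifier("a\u00B7b"));
        // U+0000 is identifier-ignorable, so the profile admits it as Continue.
        System.out.println(isUnicodeIdentifier("a\u0000b"));
    }
}
```

      Note that the ignorable characters are accepted only in the Continue position; Character.isUnicodeIdentifierStart still rejects them.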

            naoto Naoto Sato
            naoto Naoto Sato
            Roger Riggs