JDK-6185419

Unicode behavior change in Character.isLetter() post-Mantis


    • Type: Bug
    • Resolution: Not an Issue
    • Priority: P3
    • None
    • 5.0, 6
    • core-libs
    • None
    • generic, sparc
    • solaris_9

      A major Sun customer cannot move to Tiger because of a change in the
      behavior of Character.isLetter().

      Using the code snippet provided by the customer, the behavior of
      isLetter() changes after Mantis as follows:

      tmarble@fred 38% pwd
      /home/tmarble/javaperf/2004/TLR/Mantis
      tmarble@fred 39% /usr/java/j2sdk1.4.2_07/bin/javac UnicodeTest.java
      tmarble@fred 40% /usr/java/j2sdk1.4.2_07/bin/java UnicodeTest
      is letter: false
      is digit: false
      tmarble@fred 41% cd ../Tiger
      /home/tmarble/javaperf/2004/TLR/Tiger
      tmarble@fred 42% /usr/java/jdk1.5.0_02/bin/javac UnicodeTest.java
      tmarble@fred 43% /usr/java/jdk1.5.0_02/bin/java UnicodeTest
      is letter: true
      is digit: false
      tmarble@fred 44% cd ../Mustang
      /home/tmarble/javaperf/2004/TLR/Mustang
      tmarble@fred 45% /usr/java/jdk1.6.0/bin/javac UnicodeTest.java
      tmarble@fred 46% /usr/java/jdk1.6.0/bin/java UnicodeTest
      is letter: true
      is digit: false
      tmarble@fred 47%

      This is a bug because the behavior changed.
      As the character in question is a modifier character, I'm not
      sure what the "right" behavior is, but I suspect that this may
      relate to the correct interpretation of Unicode. For further
      information, see:

         I found the relevant Unicode map here:
           http://www.unicode.org/charts/U02B0.pdf
         There is further discussion of that specific character here:
           http://www.tachyonsoft.com/uc0002.htm#U02C6
           http://www.fileformat.info/info/unicode/char/02c6/index.htm
         There is also a discussion of Unicode version 4:
           http://www.unicode.org/versions/Unicode4.0.1/

      Please note that, according to bug 5034599, Unicode 4.0.1
      will be delayed until Mustang. However, it is not clear that
      this is a Unicode 4 issue.
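
One way to narrow this down is to print the general category that the running JDK assigns to the character, since the Character.isLetter() Javadoc defines a letter purely in terms of Character.getType(). The sketch below is illustrative only: the class name LetterCategoryCheck and the helper isLetterCategory are hypothetical and not part of the customer's test. It assumes the difference between releases comes from the Unicode data each JDK ships. My understanding is that U+02C6 was reclassified from Sk (modifier symbol) to Lm (modifier letter) in Unicode 4.0, but that should be verified against the Unicode data files.

```java
// Sketch: report the general category that isLetter() is derived from.
final class LetterCategoryCheck {

    // Per the Character.isLetter() Javadoc, a character is a letter when
    // getType() returns one of these five general categories.
    static boolean isLetterCategory(int type) {
        return type == Character.UPPERCASE_LETTER
            || type == Character.LOWERCASE_LETTER
            || type == Character.TITLECASE_LETTER
            || type == Character.MODIFIER_LETTER
            || type == Character.OTHER_LETTER;
    }

    public static void main(String[] args) {
        char c = '\u02C6'; // MODIFIER LETTER CIRCUMFLEX ACCENT
        int type = Character.getType(c);
        System.out.println("general category: " + type);
        System.out.println("is MODIFIER_LETTER: " + (type == Character.MODIFIER_LETTER));
        System.out.println("is MODIFIER_SYMBOL: " + (type == Character.MODIFIER_SYMBOL));
        System.out.println("is letter: " + Character.isLetter(c));
    }
}
```

Running this under each JDK alongside UnicodeTest should show whether the isLetter() result simply follows a category change in the underlying Unicode data.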

      Also, correct behavior for this one Unicode character may not
      indicate correctness across the universe of possible letters
      (the correctness of isLetter() must be reviewed in the general case).
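
A rough way to start the general-case review would be to tally, per general category, every BMP character that isLetter() accepts, and then compare the tallies produced by each JDK. This is only a sketch under that assumption. The class name LetterSurvey and the method countLettersByCategory are hypothetical, and the output is a per-category count rather than a character-by-character diff.

```java
// Sketch: survey which general categories isLetter() accepts on this JDK.
final class LetterSurvey {

    // Counts, per general category, the BMP characters for which
    // Character.isLetter() returns true on the running JDK.
    static int[] countLettersByCategory() {
        // General category constants in java.lang.Character run from 0 to 30.
        int[] counts = new int[31];
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            char ch = (char) c;
            if (Character.isLetter(ch)) {
                counts[Character.getType(ch)]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLettersByCategory();
        for (int type = 0; type < counts.length; type++) {
            if (counts[type] > 0) {
                System.out.println("category " + type + ": " + counts[type] + " letters");
            }
        }
    }
}
```

Saving the output from the Mantis, Tiger, and Mustang runs and diffing the files would show which categories gained or lost characters between releases.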

      --Tom
      ###@###.### 10/27/04 20:07 GMT
      ###@###.### 10/27/04 22:30 GMT


      The test source code as provided by ###@###.###:

      public class UnicodeTest {

        public static void main(String[] args) {
          // U+02C6 MODIFIER LETTER CIRCUMFLEX ACCENT, parsed from its hex code point
          char myUnicodeCharacter = (char) Integer.parseInt("2C6", 16);

          System.out.println("is letter: " + Character.isLetter(myUnicodeCharacter));
          System.out.println("is digit: " + Character.isDigit(myUnicodeCharacter));
        }
      }
      ###@###.### 10/28/04 17:40 GMT

            nlindenbsunw Norbert Lindenberg (Inactive)
            tmarble Tom Marble (Inactive)