JDK-4217441

String.toLowerCase() doesn't handle Greek sigma


    • tiger
    • generic, x86
    • generic, windows_nt



      Name: sg39081 Date: 03/04/99


       From examining the code in String.java (in the current JDK 1.2.2
      build), it appears that String.toLowerCase() does not correctly
      handle the Greek capital letter sigma. The lower-case sigma has
      two presentation forms: initial/medial and final, which are
      represented by different Unicode code-point values. Translation
      to lower case is thus context-sensitive: If the character
      following the sigma is a letter, use the initial/medial form;
      otherwise, use the final form. The current logic relies on the
      Unicode character database, which will always return the
      initial/medial form.
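
      The rule described above can be sketched as follows. This is a
      minimal illustration of the rule exactly as stated in this report
      (look only at the character that follows the sigma), not the JDK's
      implementation. The class and method names are hypothetical, and
      the sketch works on char values, so it ignores surrogate pairs.

```java
// Hypothetical sketch of the context-sensitive rule stated in the report;
// this is NOT the code in String.java.
class SigmaRuleSketch {
    static final char CAPITAL_SIGMA = '\u03a3';
    static final char SMALL_SIGMA = '\u03c3';        // initial/medial form
    static final char SMALL_FINAL_SIGMA = '\u03c2';  // final form

    // Lower-cases s one char at a time, choosing the sigma form from the
    // character that follows it: letter -> medial form, anything else
    // (including end of string) -> final form.
    static String toLowerCaseSigmaAware(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == CAPITAL_SIGMA) {
                boolean letterFollows =
                    i + 1 < s.length() && Character.isLetter(s.charAt(i + 1));
                sb.append(letterFollows ? SMALL_SIGMA : SMALL_FINAL_SIGMA);
            } else {
                // All other characters use the context-free mapping.
                sb.append(Character.toLowerCase(c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "\u0399\u0395\u03a3\u03a5\u03a3 \u03a7\u03a1\u0399\u03a3\u03a4\u039f\u03a3";
        String expected = "\u03b9\u03b5\u03c3\u03c5\u03c2 \u03c7\u03c1\u03b9\u03c3\u03c4\u03bf\u03c2";
        System.out.println(toLowerCaseSigmaAware(input).equals(expected) ? "PASS" : "FAIL");
    }
}
```

      Because the sketch follows only the rule as worded here, a sigma
      that is followed by a non-letter is always given the final form,
      whatever precedes it.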

      To reproduce the problem, use the following code:

      public class SigmaTest {
          public static void main(String[] args) {
              String input = "\u0399\u0395\u03a3\u03a5\u03a3 \u03a7\u03a1\u0399\u03a3\u03a4\u039f\u03a3";
                      // "IESUS XRISTOS"

              String output = input.toLowerCase();

              if (output.equals("\u03b9\u03b5\u03c3\u03c5\u03c2 \u03c7\u03c1\u03b9\u03c3\u03c4\u03bf\u03c2"))
                  System.out.println("PASS");
              else {
                  for (int i = 0; i < output.length(); i++)
                      System.out.print(" " + Integer.toHexString((int)(output.charAt(i))));
                  System.out.println();
              }
          }
      }

      This program produces the following output:

      3b9 3b5 3c3 3c5 3c3 20 3c7 3c1 3b9 3c3 3c4 3bf 3c3

      The 3c3 at the end of the string and the one before the space
      should both be 3c2. (The other two 3c3's should still be 3c3.)
      (Review ID: 53991)
      ======================================================================

      Name: skT88420 Date: 12/16/99


      java version "1.2.2"
      Classic VM (build JDK-1.2.2-W, native threads, symcjit)


      Greek letters, small and capital, are displayed correctly,
      but converting small Greek letters (Unicode) to upper case
      results in the display of the Latin version of the characters,
      i.e. small pi converts to capital P, small sigma to capital S,
      etc.
      (Review ID: 99105)
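
      One way to tell a conversion problem from a rendering problem is
      to print the code points of the converted string instead of
      displaying it. The following is a small sketch for that purpose;
      the class and method names are hypothetical, and the use of
      Locale.ROOT is an assumption made to keep the result independent
      of the default locale.

```java
import java.util.Locale;

// Hypothetical diagnostic: shows which characters toUpperCase() actually
// produced, independent of how a font or console renders them.
class UpperCaseCodePoints {
    // Returns the char values of s as space-separated lower-case hex.
    static String hexCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String lower = "\u03c0\u03c3"; // small pi, small sigma
        String upper = lower.toUpperCase(Locale.ROOT);
        System.out.println(hexCodePoints(upper));
    }
}
```

      A correct conversion yields 3a0 3a3 (Greek capital pi and capital
      sigma). Latin capital P and S would appear as 50 53 instead.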
      ======================================================================

      Additional test case from duplicate 4519837:

      public class A {
          public static void main(String[] argv) {
              String checkedString = "\u03A30";
              String ExpectedLowerString = "\u03C20";
              if(!checkedString.toLowerCase().equals(ExpectedLowerString))
                  System.out.println("Incorrect lowercase");
              else
                  System.out.println("ok");
          }
      }

      ======================================================================

            naoto Naoto Sato
            sgoodsunw Sheri Good (Inactive)