JDK-5047314

[Col] Collator.compare() runs indefinitely for a certain set of Thai strings

    • Type: Bug
    • Resolution: Fixed
    • Priority: P4
    • Fix Version: 7
    • Affects Version: 1.4.2, 6u16, 6-pool
    • Component: core-libs
    • Resolved In Build: b84
    • CPU: x86, sparc
    • OS: solaris_10, windows, windows_xp

        Name: rmT116609 Date: 05/13/2004


        FULL PRODUCT VERSION :
        java version "1.4.2_04"
        Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_04-b05)
        Java HotSpot(TM) Client VM (build 1.4.2_04-b05, mixed mode)

        java version "1.5.0-beta"
        Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-beta-b32c)
        Java HotSpot(TM) Client VM (build 1.5.0-beta-b32c, mixed mode)

        A DESCRIPTION OF THE PROBLEM :
        When using a Thai collator returned from Collator.getInstance(new Locale("th")), the Collator.compare(string1, string2) method runs forever when string1 and string2 are identical and consist solely of one of the following Thai characters:

          \u0e40
          \u0e41
          \u0e42
          \u0e43
          \u0e44

        Note that the above characters are all special Thai "prefix" vowels.

        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        Compile and run the test case.

        EXPECTED VERSUS ACTUAL BEHAVIOR :
        EXPECTED -
        The compare method returns 0, since the two strings are identical.
        ACTUAL -
        The compare method runs forever and never returns.

        ERROR MESSAGES/STACK TRACES THAT OCCUR :
        No exception or error occurs.

        REPRODUCIBILITY :
        This bug can be reproduced always.

        ---------- BEGIN SOURCE ----------
        import java.text.Collator;
        import java.util.Locale;

        public class WordCount {
            public static void main(String[] args) {
                Collator c = Collator.getInstance(new Locale("th"));
                String s = "\u0e40";
                // any one of \u0e40, \u0e41, \u0e42, \u0e43, or \u0e44 will do
                System.out.println(c.compare(s, s)); // runs forever
                System.out.println("never reach here");
            }
        }
        ---------- END SOURCE ----------
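
Because the test case above never returns on affected builds, it is awkward to run inside an automated harness. The following is a minimal sketch, not part of the original report, of one way to bound the call: it runs compare() on a daemon thread and gives up after a timeout. The class name CompareWithTimeout, the method name, and the timeout value are all hypothetical.

```java
import java.text.Collator;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for probing a Collator without hanging the caller.
final class CompareWithTimeout {
    /**
     * Returns the result of c.compare(a, b), or null if the call did not
     * finish within timeoutMillis.
     */
    static Integer compareWithTimeout(Collator c, String a, String b, long timeoutMillis) {
        // Daemon thread: a compare() that spins forever cannot be interrupted,
        // so the worker must not keep the JVM alive.
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "collator-probe");
            t.setDaemon(true);
            return t;
        });
        try {
            Callable<Integer> task = () -> c.compare(a, b);
            Future<Integer> result = pool.submit(task);
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return null; // the call is still running; treat it as a hang
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With this helper, a harness could call compareWithTimeout(Collator.getInstance(new Locale("th")), s, s, 5000) and treat a null result as a reproduction of the hang. On a timeout the worker thread keeps spinning until the JVM exits, so this is suitable for short-lived test processes only.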

        CUSTOMER SUBMITTED WORKAROUND :
        Wrap the Thai collator with a hard-coded check.
        (Incident Review ID: 265148)
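
The report does not show the submitter's wrapper, so the following is only a sketch of what such a hard-coded check might look like. It assumes the hang is limited to the reported case (two identical strings consisting of a single prefix vowel) and short-circuits that case before delegating. The class name GuardedThaiCollator and the helper isLonePrefixVowel are hypothetical.

```java
import java.text.Collator;
import java.util.Comparator;
import java.util.Locale;

// Hypothetical wrapper; the report does not show the submitter's actual code.
final class GuardedThaiCollator implements Comparator<String> {
    // The five Thai prefix vowels listed in the report.
    private static final String PREFIX_VOWELS = "\u0e40\u0e41\u0e42\u0e43\u0e44";

    private final Collator delegate = Collator.getInstance(new Locale("th"));

    // True when s is exactly one character and that character is a prefix vowel.
    static boolean isLonePrefixVowel(String s) {
        return s.length() == 1 && PREFIX_VOWELS.indexOf(s.charAt(0)) >= 0;
    }

    @Override
    public int compare(String a, String b) {
        // Hard-coded check: identical lone prefix vowels never reach the collator.
        if (a.equals(b) && isLonePrefixVowel(a)) {
            return 0;
        }
        return delegate.compare(a, b);
    }
}
```

An instance can be passed wherever a Comparator<String> is accepted, for example to Collections.sort. The guard covers only the compare() case described in this report; it does not protect calls to getCollationKey() or any other input that might trigger the same defect.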
        ======================================================================
        ###@###.### 11/2/04 18:37 GMT
        The OutOfMemoryError is prevalent. We tested on Linux and Windows, on JDK versions 1.5 and 1.6.0_16.

        Here is a simple repro case:

        Collator.getInstance(new Locale("th")).getCollationKey("\u0e44");

        The following test covers the OOM scenarios:

                    // Assumes java.text.CollationKey, java.text.Collator, java.util.Locale,
                    // and JUnit's assertEquals are imported by the enclosing test class.
                    Locale thaiLoc = new Locale("th");
                    Collator thaiColl = Collator.getInstance(thaiLoc);
                    String[] oomStrings = { "\u0e44", "\u0e43", "\u0e42", "\u0e41", "\u0e40" };
                    for (int i = 0; i < oomStrings.length; i++) {
                      String oom = oomStrings[i];
                      // Same call as in the simple repro case above
                      CollationKey key = thaiColl.getCollationKey(oom);
                      assertEquals("string #" + i, oom, key.getSourceString());
                    }

              peytoia Yuka Kamiya (Inactive)
              rmandalasunw Ranjith Mandala (Inactive)