  1. JDK
  2. JDK-4423383

[Col] Unexpected results with RuleBasedCollator in German locale


    • Type: Bug
    • Resolution: Not an Issue
    • Priority: P4
    • None
    • 1.3.0
    • globalization

      Name: boT120536 Date: 03/08/2001


      java version "1.3.0_01"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0_01)
      Java HotSpot(TM) Client VM (build 1.3.0_01, mixed mode)

      Firstly, the javadoc for the Collator class is wrong. A bug has already been
      submitted for "incorrect javadoc", but it doesn't mention the problem I've
      found.

      The javadoc states (I've added Unicode values next to the non-ASCII characters):

      "in traditional German a-umlaut is treated as though it expanded to two
      characters (expressed as "a,A < b,B ... & ae;ã(\u00E3) & AE;Ã(\u00C3)"). [ã
      (\u00E3) and Ã(\u00C3) are, of course, the escape sequences for a-umlaut.] "

      The characters in the javadoc are NOT the escape sequences for a-umlaut; they
      correspond to a-tilde. Tracing the javadoc back to JDK 1.1, the class has
      different text in every version.
      JDK 1.2 shows plain characters with no diacritic at all (i.e. "a,A < b,B ... & ae;a
      & AE;A"), while JDK 1.1 shows the correct character (i.e. "a & ae ; ä(\u00E4) <
      b").
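
The code point claim above can be checked directly. The following is a minimal sketch, assuming Java 7 or later (for `Character.getName`); the class and method names (`UmlautCodePoints`, `nameOf`, `sameAfterNfc`) are hypothetical and not part of the report. It also shows that the precomposed a-umlaut and the sequence a + U+0308 are canonically equivalent.

```java
import java.text.Normalizer;

public class UmlautCodePoints {

    // Returns the Unicode character name for a code point (hypothetical helper).
    static String nameOf(int codePoint) {
        return Character.getName(codePoint);
    }

    // True if both strings are identical after NFC normalization (hypothetical helper).
    static boolean sameAfterNfc(String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        // The code points quoted in the 1.3 javadoc: both are A WITH TILDE.
        System.out.println("U+00E3 " + nameOf(0x00E3));
        System.out.println("U+00C3 " + nameOf(0x00C3));
        // The code points for a-umlaut: A WITH DIAERESIS.
        System.out.println("U+00E4 " + nameOf(0x00E4));
        System.out.println("U+00C4 " + nameOf(0x00C4));
        // Precomposed a-umlaut versus 'a' followed by the combining diaeresis.
        System.out.println(sameAfterNfc("\u00E4", "a\u0308"));
    }
}
```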

      In practice, I can never get the ä(\u00E4) -> ae equivalence to work, nor does
      it work when I use the a\u0308 Unicode combining sequence for the same character. The
      following code snippet illustrates this:
      <PRE>

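          import java.text.Collator;
          import java.util.Locale;

          public class SubmissionTest {

              public static void main(String[] args) {
                  SubmissionTest i18n = new SubmissionTest();
                  // Precomposed a-umlaut typed as a literal character.
                  i18n.test("Schaltflächen", "Schaltflaechen");
                  // Precomposed a-umlaut written as a Unicode escape.
                  i18n.test("Schaltfl\u00E4chen", "Schaltflaechen");
                  // Decomposed form: 'a' followed by the combining diaeresis.
                  i18n.test("Schaltfla\u0308chen", "Schaltflaechen");
                  // Sharp s against its "ss" spelling.
                  i18n.test("Fußball", "Fussball");
              }

              // Prints whether the German collator treats s1 and s2 as equal
              // at SECONDARY strength.
              public void test(String s1, String s2) {
                  Collator collator = Collator.getInstance(Locale.GERMAN);
                  collator.setStrength(Collator.SECONDARY);
                  System.out.print("[" + s1 + "] and [" + s2 + "] are ");

                  if (collator.compare(s1, s2) == 0) {
                      System.out.println("equivalent.");
                  } else {
                      System.out.println("NOT equivalent.");
                  }
              }
          }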
      </PRE>

      The output from this is:

      <PRE>
      [Schaltfl?chen] and [Schaltflaechen] are NOT equivalent.
      [Schaltfl?chen] and [Schaltflaechen] are NOT equivalent.
      [Schaltfla?chen] and [Schaltflaechen] are NOT equivalent.
      [Fu?ball] and [Fussball] are equivalent.
      </PRE>
      (Review ID: 118386)
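
For readers who need the ae/oe/ue/ss equivalence regardless of how the default German collator behaves, one possible workaround is to expand the characters before comparing. This is only a sketch under stated assumptions: it is not taken from the report or from the JDK documentation, it does not change the collator's rules, and the names `GermanExpansionFold`, `fold`, and `equivalent` are hypothetical.

```java
import java.text.Collator;
import java.text.Normalizer;
import java.util.Locale;

public class GermanExpansionFold {

    // Hypothetical helper: composes the input to NFC, then spells out
    // umlauts and sharp s as two letters.
    static String fold(String s) {
        String nfc = Normalizer.normalize(s, Normalizer.Form.NFC);
        StringBuilder out = new StringBuilder(nfc.length() + 4);
        for (int i = 0; i < nfc.length(); i++) {
            char c = nfc.charAt(i);
            switch (c) {
                case '\u00E4': out.append("ae"); break; // a-umlaut
                case '\u00F6': out.append("oe"); break; // o-umlaut
                case '\u00FC': out.append("ue"); break; // u-umlaut
                case '\u00C4': out.append("Ae"); break; // A-umlaut
                case '\u00D6': out.append("Oe"); break; // O-umlaut
                case '\u00DC': out.append("Ue"); break; // U-umlaut
                case '\u00DF': out.append("ss"); break; // sharp s
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Compares the folded strings with the German collator at SECONDARY strength.
    static boolean equivalent(String s1, String s2) {
        Collator collator = Collator.getInstance(Locale.GERMAN);
        collator.setStrength(Collator.SECONDARY);
        return collator.compare(fold(s1), fold(s2)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(equivalent("Schaltfl\u00E4chen", "Schaltflaechen"));
        System.out.println(equivalent("Schaltfla\u0308chen", "Schaltflaechen"));
        System.out.println(equivalent("Fu\u00DFball", "Fussball"));
    }
}
```

The trade-off is that folding is lossy: words that differ only in these spellings (for example one written with sharp s and another with "ss") become indistinguishable, so this suits matching rather than strict ordering.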
      ======================================================================
      ###@###.### 11/2/04 18:31 GMT

            jtusla Jiri Tusla (Inactive)
            bonealsunw Bret O'neal (Inactive)