-
Bug
-
Resolution: Not an Issue
-
P4
-
None
-
1.3.0
-
x86
-
windows_nt
Name: boT120536 Date: 03/08/2001
java version "1.3.0_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0_01)
Java HotSpot(TM) Client VM (build 1.3.0_01, mixed mode)
Firstly, the javadoc for the Collator class is wrong. A bug has already been
submitted for "incorrect javadoc", but it doesn't mention the problem I've
found.
The javadoc states (I've added Unicode values to the non-ASCII characters):
"in traditional German a-umlaut is treated as though it expanded to two
characters (expressed as "a,A < b,B ... & ae;ã(\u00E3) & AE;Ã(\u00C3)"). [ã
(\u00E3) and Ã(\u00C3) are, of course, the escape sequences for a-umlaut.] "
The characters in the javadoc are NOT the escape sequences for a-umlaut; they
correspond to a-tilde. Tracing the javadoc back to JDK 1.1, the class has
different text in every version.
JDK 1.2 shows plain characters with no added tilde (i.e. "a,A < b,B ... & ae;a
& AE;A"), while JDK 1.1 shows the correct characters (i.e. "a & ae ; ä(\u00E4) <
b").
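The escape-sequence claim above can be checked mechanically. The following is a minimal sketch, not part of the original report: the class name EscapeCheck and the helper describe are made up for illustration, and Character.getName requires Java 7 or later, which is newer than the 1.3.0_01 runtime this report was filed against.

```java
public class EscapeCheck {
    // Hypothetical helper: formats a character as "U+XXXX <Unicode character name>".
    public static String describe(char c) {
        return String.format("U+%04X %s", (int) c, Character.getName(c));
    }

    public static void main(String[] args) {
        // The first two are the escapes quoted from the 1.3 javadoc;
        // the last two are the escapes for a-umlaut.
        char[] candidates = {'\u00E3', '\u00C3', '\u00E4', '\u00C4'};
        for (char c : candidates) {
            System.out.println(describe(c));
        }
    }
}
```

The names printed for the first two characters contain "WITH TILDE" and the names for the last two contain "WITH DIAERESIS", which is the mismatch described above.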
In practice, I can never get the ä(\u00E4) -> ae equivalence to work (nor does
it work when I use the a\u0308 Unicode combining sequence for the same
character). The following code snippet illustrates this:
<PRE>
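import java.text.Collator;
import java.util.Locale;

public class SubmissionTest {
    public static void main(String[] args) {
        SubmissionTest i18n = new SubmissionTest();
        i18n.test("Schaltflächen", "Schaltflaechen");       // literal a-umlaut
        i18n.test("Schaltfl\u00E4chen", "Schaltflaechen");  // precomposed a-umlaut
        i18n.test("Schaltfla\u0308chen", "Schaltflaechen"); // a + combining diaeresis
        i18n.test("Fußball", "Fussball");                   // sharp s versus "ss"
    }

    // Reports whether the German collator treats s1 and s2 as equal
    // when differences below SECONDARY strength are ignored.
    public void test(String s1, String s2) {
        Collator collator = Collator.getInstance(Locale.GERMAN);
        collator.setStrength(Collator.SECONDARY);
        System.out.print("[" + s1 + "] and [" + s2 + "] are ");
        if (collator.compare(s1, s2) == 0) {
            System.out.println("equivalent.");
        } else {
            System.out.println("NOT equivalent.");
        }
    }
}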
</PRE>
The output from this is:
<PRE>
[Schaltfl?chen] and [Schaltflaechen] are NOT equivalent.
[Schaltfl?chen] and [Schaltflaechen] are NOT equivalent.
[Schaltfla?chen] and [Schaltflaechen] are NOT equivalent.
[Fu?ball] and [Fussball] are equivalent.
</PRE>
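For comparison, here is a hedged sketch of how the expansion described in the javadoc could be added by hand. It is an illustration under assumptions, not the behaviour of the stock German collator and not a fix from the JDK: the class UmlautCollation and its two helpers are hypothetical names, the cast to RuleBasedCollator assumes the built-in locale provider is in use, and the appended rule string is my own. Note that the sketch uses "," (a tertiary difference) so that the strings compare equal at SECONDARY strength; the ";" form quoted from the javadoc is a secondary difference and would only compare equal at PRIMARY strength.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

public class UmlautCollation {
    // Stock German collator at the requested strength.
    public static Collator german(int strength) {
        Collator c = Collator.getInstance(Locale.GERMAN);
        c.setStrength(strength);
        return c;
    }

    // Hypothetical helper: stock German rules plus hand-written expansion
    // rules that place a-umlaut at "ae" with only a tertiary difference.
    public static Collator traditionalGerman(int strength) throws ParseException {
        // Assumption: the built-in provider returns a RuleBasedCollator.
        RuleBasedCollator base =
                (RuleBasedCollator) Collator.getInstance(Locale.GERMAN);
        String rules = base.getRules() + " & ae , \u00E4 & AE , \u00C4";
        RuleBasedCollator c = new RuleBasedCollator(rules);
        c.setStrength(strength);
        return c;
    }

    public static void main(String[] args) throws ParseException {
        String umlaut = "Schaltfl\u00E4chen";
        Collator stock = german(Collator.SECONDARY);
        Collator custom = traditionalGerman(Collator.SECONDARY);
        System.out.println("stock:  " + stock.compare(umlaut, "Schaltflaechen"));
        System.out.println("custom: " + custom.compare(umlaut, "Schaltflaechen"));
        // At PRIMARY strength the stock collator ignores the accent entirely.
        System.out.println("primary vs plain a: "
                + german(Collator.PRIMARY).compare(umlaut, "Schaltflachen"));
    }
}
```

A result of 0 from compare means the two strings are treated as equal at the chosen strength. The only umlaut expansions added here are for ä and Ä; ö and ü would need analogous rules.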
(Review ID: 118386)
======================================================================
###@###.### 11/2/04 18:31 GMT
- relates to
-
JDK-4115499 RFE: Support both traditional and modern sort orders for Spanish and German
-
- Closed
-