- Bug
- Resolution: Fixed
- P2
- 1.1.5
- 1.1.6
- x86
- windows_95
- Verified
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2018106 | 1.2.0 | Btplusnull User | P2 | Resolved | Fixed | 1.2beta3 |
Name: bb33257 Date: 11/25/97
In addition to the problems reported in bug #4068133, there are
quite a few characters that the BreakIterator returned by
BreakIterator.getLineInstance() doesn't treat correctly in the
presence of CJK ideographs. To see the problem, run the
following code:
import java.text.BreakIterator;
import java.util.Locale;

// errln() is assumed to be the error reporter of the enclosing test class.
public void TestJapaneseLineBreak()
{
    // Position 1 (the 'x') is replaced by each character under test.
    StringBuffer testString = new StringBuffer("\u4e00x\u4e8c");
    String precedingChars = "([{«$¥£¤\u2018\u201a\u201c\u201e\u201b\u201f";
    String followingChars = ")]}»!%,.\u3001\u3002\u3063\u3083\u3085\u3087\u30c3\u30e3\u30e5\u30e7\u30fc:;\u309b\u309c\u3005\u309d\u309e\u30fd\u30fe\u2019\u201d\u00b0\u2032\u2033\u2034\u2030\u2031\u2103\u2109\u00a2\u0300\u0301\u0302";
    BreakIterator iter = BreakIterator.getLineInstance(Locale.JAPAN);

    // "Preceding" characters: a break is expected before the character (at 1)
    // and none after it, so the next boundary is the end of the text (3).
    for (int i = 0; i < precedingChars.length(); i++) {
        testString.setCharAt(1, precedingChars.charAt(i));
        iter.setText(testString.toString());
        int j = iter.first();
        if (j != 0)
            errln("ja line break failure: failed to start at 0");
        j = iter.next();
        if (j != 1)
            errln("ja line break failure: failed to stop before '" + precedingChars.charAt(i)
                + "' (" + ((int)(precedingChars.charAt(i))) + ")");
        j = iter.next();
        if (j != 3)
            errln("ja line break failure: failed to skip position after '" + precedingChars.charAt(i)
                + "' (" + ((int)(precedingChars.charAt(i))) + ")");
    }

    // "Following" characters: no break is expected before the character,
    // so the first boundary is after it (at 2), then the end of the text (3).
    for (int i = 0; i < followingChars.length(); i++) {
        testString.setCharAt(1, followingChars.charAt(i));
        iter.setText(testString.toString());
        int j = iter.first();
        if (j != 0)
            errln("ja line break failure: failed to start at 0");
        j = iter.next();
        if (j != 2)
            errln("ja line break failure: failed to skip position before '" + followingChars.charAt(i)
                + "' (" + ((int)(followingChars.charAt(i))) + ")");
        j = iter.next();
        if (j != 3)
            errln("ja line break failure: failed to stop after '" + followingChars.charAt(i)
                + "' (" + ((int)(followingChars.charAt(i))) + ")");
    }
}
The following "preceding" characters don't get treated correctly:
\u0024 The ASCII dollar sign
\u00a3 The British pound sign
\u00a4 The generic currency symbol
\u00a5 The yen sign
The following "following" characters don't get treated correctly:
\u3063, \u3083, \u3085, \u3087, \u30c3, \u30e3, \u30e5, \u30e7 The small Kana characters
\u30fc The Katakana long-vowel mark
\u309b, \u309c The Kana voiced/semi-voiced sound marks
\u3005, \u309d, \u309e, \u30fd, \u30fe The CJK iteration marks
\u00b0 The degree sign
\u2032, \u2033, \u2034 Prime marks
\u2103, \u2109 Degrees Celsius and degrees Fahrenheit
\u00a2 The cent sign
\u0300, \u0301, \u0302 All non-spacing marks
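To inspect individual characters outside the test harness, a minimal standalone sketch along the following lines may help. It assumes only java.text.BreakIterator; the class name LineBreakProbe, the helper boundaries(), and the choice of sample characters are hypothetical and not part of the original report. Going by the expectations encoded in the test above, a "preceding" character should produce the boundaries 0, 1, 3 and a "following" character 0, 2, 3. The actual output depends on the JDK in use, so none is shown here.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LineBreakProbe {
    /** Collects every boundary the line-break iterator reports for the text. */
    static List<Integer> boundaries(String text, Locale locale) {
        BreakIterator iter = BreakIterator.getLineInstance(locale);
        iter.setText(text);
        List<Integer> result = new ArrayList<>();
        for (int pos = iter.first(); pos != BreakIterator.DONE; pos = iter.next()) {
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        // A few of the characters named in the report, chosen for illustration.
        char[] samples = { '$', '\u00a5', '\u3063', '\u30fc', '\u3005', '\u00b0', '\u2103', '\u0301' };
        for (char c : samples) {
            // Same shape as the test string: ideograph, character under test, ideograph.
            String text = "\u4e00" + c + "\u4e8c";
            System.out.printf("U+%04X -> %s%n", (int) c, boundaries(text, Locale.JAPAN));
        }
    }
}
```

Printing the full boundary list, rather than checking two calls to next(), makes it easier to see whether a failure is a missing break or an extra one.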
======================================================================
- backported by: JDK-2018106 CJK line-breaking not completely correct (Resolved)