-
Bug
-
Resolution: Fixed
-
P2
-
5.0
-
b57
-
generic
-
generic
The charsets ISO-2022-KR, ISO-2022-CN-CNS, ISO-2022-CN-GB sometimes
throw BufferOverflowException, especially when encoding one char.
For example, the following test program:
--------------------------------------------------------------
import java.io.*;
import java.util.*;
import java.nio.charset.*;
import java.nio.*;
public class FindOneCharEncoderBugs {
    public static void main(String[] args) throws Exception {
        for (Map.Entry<String,Charset> e
                 : Charset.availableCharsets().entrySet()) {
            String csn = e.getKey();
            Charset cs = e.getValue();
            int failures = 0;
            System.out.println(csn);
            if (csn.equals("x-IBM933")) continue; // hangs!
            // Ignore decoder-only charsets
            try { cs.newEncoder(); }
            catch (UnsupportedOperationException x) { continue; }
            for (int i = 0; failures < 5 && i <= 0xffff; i++) {
                String s = new String(new char[] { (char)i });
                try {
                    s.getBytes(csn);
                } catch (BufferOverflowException x) {
                    System.out.printf("Overflow: charset=%s char=%x%n", csn, i);
                    failures++;
                    i += 100;
                } catch (Throwable t) {
                    System.out.printf("%s charset=%s char=%x%n", t, csn, i);
                    i += 100;
                }
            }
        }
    }
}
--------------------------------------------------------------
prints (among other things):
Overflow: charset=ISO-2022-KR char=a1
Overflow: charset=ISO-2022-KR char=111
Overflow: charset=ISO-2022-KR char=2c7
Overflow: charset=ISO-2022-KR char=391
Overflow: charset=ISO-2022-KR char=401
Overflow: charset=x-ISO-2022-CN-CNS char=a7
Overflow: charset=x-ISO-2022-CN-CNS char=2c7
Overflow: charset=x-ISO-2022-CN-CNS char=391
Overflow: charset=x-ISO-2022-CN-CNS char=2013
Overflow: charset=x-ISO-2022-CN-CNS char=2103
Overflow: charset=x-ISO-2022-CN-GB char=a4
Overflow: charset=x-ISO-2022-CN-GB char=113
Overflow: charset=x-ISO-2022-CN-GB char=1ce
Overflow: charset=x-ISO-2022-CN-GB char=2c7
Overflow: charset=x-ISO-2022-CN-GB char=391
It is particularly egregious that strings consisting of one ASCII character
cannot be correctly encoded.
###@###.### 2004-06-06
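A narrower check may help isolate the failure to the encoder itself. The sketch below assumes (this is not stated in the report) that the overflow arises when a one-char input is encoded into an output buffer sized from CharsetEncoder.maxBytesPerChar(), so that any escape or shift sequence the encoder emits no longer fits. The class name OneCharEncodeCheck and the helper overflows are hypothetical names introduced only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

final class OneCharEncodeCheck {

    /**
     * Hypothetical helper: encodes the single char c into a buffer of
     * ceil(maxBytesPerChar) bytes and reports whether the encoder
     * signals overflow during encode() or flush().
     */
    static boolean overflows(Charset cs, char c) {
        CharsetEncoder enc = cs.newEncoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // Assumption: one char needs at most maxBytesPerChar bytes.
        int capacity = (int) Math.ceil(enc.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(capacity);
        CharBuffer in = CharBuffer.wrap(new char[] { c });
        if (enc.encode(in, out, true).isOverflow()) {
            return true;
        }
        // flush() may still need room for trailing shift/escape bytes.
        return enc.flush(out).isOverflow();
    }

    public static void main(String[] args) {
        String[] names = { "ISO-2022-KR", "x-ISO-2022-CN-CNS", "x-ISO-2022-CN-GB" };
        for (String name : names) {
            // These charsets are optional; skip any the runtime lacks.
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available");
                continue;
            }
            Charset cs = Charset.forName(name);
            if (!cs.canEncode()) continue; // decoder-only charset
            System.out.printf("%s char=a1 overflow=%b%n",
                              name, overflows(cs, '\u00a1'));
        }
    }
}
```

On a release that contains the fix, the helper would be expected to report no overflow for these charsets; the char a1 is taken from the first failure listed above.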
- relates to
-
JDK-5058184 Problems with encoders for IBM933, IBM949, IBM949C, IBM970
- Resolved