- Enhancement
- Resolution: Duplicate
- P3
- None
- 1.2.0
- generic
- generic
The ByteToCharEUC_JP code converter throws StringIndexOutOfBoundsException as
follows.
java.lang.StringIndexOutOfBoundsException: String index out of range: -129
    at java.lang.String.charAt(String.java:392)
    at sun.io.ByteToCharEUC_JP.getUnicode(ByteToCharEUC_JP.java:72)
    at sun.io.ByteToCharEUC_JP.convert(ByteToCharEUC_JP.java:150)
    at sun.io.ByteToCharJISAutoDetect.convert(ByteToCharJISAutoDetect.java:140)
    at java.io.InputStreamReader.convertInto(InputStreamReader.java:123)
    at java.io.InputStreamReader.fill(InputStreamReader.java:152)
    at java.io.InputStreamReader.read(InputStreamReader.java:229)
    at java.io.Reader.read(Reader.java:103)
    at sunw.html.Parser.readCh(Parser.java:1983)
    at sunw.html.Parser.parseAttributeValue(Parser.java:1223)
    at sunw.html.Parser.parseAttributeSpecificationList(Parser.java:1270)
    at sunw.html.Parser.parseTag(Parser.java:1712)
    at sunw.html.Parser.parseContent(Parser.java:1807)
    at sunw.html.Parser.parse(Parser.java:1922)
    at sunw.hotjava.doc.DocParser.run(DocParser.java:663)
    at java.lang.Thread.run(Thread.java:490)
ByteToCharJISAutoDetect currently has a bug: it misidentifies the character
encoding and invokes the EUC_JP code converter. The EUC_JP converter used to
detect the wrong encoding and throw MalformedInputException, but it now
throws StringIndexOutOfBoundsException.
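The difference between the old and new behavior can be illustrated with a range check in front of a table lookup. This is only a sketch under stated assumptions: GuardedLookup, TABLE and lookup are hypothetical names, the table contents are placeholders rather than real EUC_JP mapping data, and java.nio.charset.MalformedInputException stands in for the sun.io exception named in this report.

```java
import java.nio.charset.MalformedInputException;

public class GuardedLookup {
    // Hypothetical stand-in for a converter's mapping table (not real EUC_JP data).
    static final String TABLE = "\u3042\u3044\u3046";

    // Validate the computed index before calling charAt, so that bad input
    // surfaces as a malformed-input error instead of StringIndexOutOfBoundsException.
    static char lookup(int index) throws MalformedInputException {
        if (index < 0 || index >= TABLE.length()) {
            // 2 = assumed length in bytes of the offending input sequence
            throw new MalformedInputException(2);
        }
        return TABLE.charAt(index);
    }

    public static void main(String[] args) {
        try {
            lookup(-129); // the index reported in the stack trace
        } catch (MalformedInputException e) {
            System.out.println("malformed input, length " + e.getInputLength());
        }
    }
}
```

With a check of this kind, an out-of-range index such as -129 is reported to the caller as malformed input rather than escaping as a runtime exception.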
To reproduce, run hotjava in the ja locale. Go to www.cnnfn.com or any web
page that contains the ISO8859-1 copyright symbol, as in "Copyright (c) ".
The auto detect converter takes this single-byte character to be the first
byte of an EUC JP (G1) character.
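Why the copyright symbol is ambiguous can be shown with a small byte-range check. In ISO8859-1 the copyright symbol is the single byte 0xA9, and in EUC JP both bytes of a two-byte G1 (JIS X 0208) character fall in the range 0xA1 to 0xFE, so 0xA9 looks like a valid first byte. The sketch below is not the detector's actual logic; EucJpLeadByteCheck, isG1Byte and startsValidG1Pair are hypothetical names, and checking the following byte is just one assumed way a detector could reject the sequence.

```java
import java.nio.charset.StandardCharsets;

public class EucJpLeadByteCheck {
    // In EUC JP, each byte of a two-byte G1 character lies in 0xA1..0xFE.
    static boolean isG1Byte(int b) {
        return b >= 0xA1 && b <= 0xFE;
    }

    // True only if the byte at pos and the byte after it are both in the G1 range.
    static boolean startsValidG1Pair(byte[] in, int pos) {
        if (pos < 0 || pos + 1 >= in.length) {
            return false;
        }
        return isG1Byte(in[pos] & 0xFF) && isG1Byte(in[pos + 1] & 0xFF);
    }

    public static void main(String[] args) {
        // Sample text with a real copyright symbol, encoded as ISO8859-1.
        byte[] latin1 = "Copyright \u00A9 1998".getBytes(StandardCharsets.ISO_8859_1);
        int pos = 10; // index of the copyright byte, after "Copyright "
        System.out.println(Integer.toHexString(latin1[pos] & 0xFF)); // a9
        System.out.println(isG1Byte(latin1[pos] & 0xFF));            // true
        System.out.println(startsValidG1Pair(latin1, pos));          // false
    }
}
```

The first check alone accepts 0xA9 as a G1 lead byte; only looking at the next byte (a space, 0x20) shows that the pair is not well-formed EUC JP.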
To make sure it is reproducible, set the code converter to Japanese Auto
Detect and specify "-log /dev/tty" for hotjava.
cindy.jao@eng 1998-03-23
- duplicates
- JDK-4121358 compile test code under /test throws StringIndexOutOfBoundsException.
- Closed
- relates to
- JDK-4087261 ByteToCharJISAutoDetect throws MalformedInputException with ISO8859-1 text
- Closed