Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2036864 | 1.4.0 | William Harnois | P4 | Resolved | Fixed | beta |
JDK-2036863 | 1.3.1 | William Harnois | P4 | Closed | Fixed | ladybird |
HTMLConverter was run on Solaris in the ja_JP.UTF-8 locale.
The source file selected was a simple HTML file containing Japanese kanji strings (encoded in EUC or SJIS).
Shortly after the conversion starts, progress stops and the log shown below is output.
There is an additional problem:
Japanese strings are corrupted when the Solaris locale in which HTMLConverter runs does not match the encoding of the HTML file.
When a "charset" META tag is written in the HTML head section, conversion succeeds in the "ja" and "ja_JP.PCK" locales.
Result matrix of the HTML conversion test:
HTML charset \ Solaris locale | ja   | ja_JP.PCK | ja_JP.UTF-8
------------------------------+------+-----------+------------
eucJP                         | pass | *1        | *2
shift_jis                     | *1   | pass      | *2
utf-8                         | *1   | *1        | pass
*1: Conversion finishes, but Japanese characters are corrupted.
*2: Conversion does not finish; a log is output.
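Case *1 is the typical result of decoding bytes with a charset other than the one they were written in: no error is raised, but the text comes out wrong. The following is a minimal sketch in modern Java, not the JDK 1.3 converter code. The bytes are "日本語" encoded as EUC-JP, and ISO-8859-1 stands in for the mismatched locale encoding only because it is guaranteed to exist on every Java runtime. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // "\u65E5\u672C\u8A9E" (three kanji) encoded as EUC-JP, hard-coded so the
    // example does not depend on extended charsets being installed.
    static final byte[] EUC_JP_BYTES = {
        (byte) 0xC6, (byte) 0xFC, (byte) 0xCB, (byte) 0xDC, (byte) 0xB8, (byte) 0xEC
    };

    // Decode with a charset that does not match the file (assumption:
    // ISO-8859-1 stands in for the locale's default encoding).
    static String decodeWithWrongCharset(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = decodeWithWrongCharset(EUC_JP_BYTES);
        // Decoding "succeeds", but each byte became its own character.
        System.out.println("decoded length: " + text.length());
        System.out.println("matches original: " + text.equals("\u65E5\u672C\u8A9E"));
    }
}
```

The decoder accepts every byte, so nothing signals the mismatch; the converted file is simply written out with the wrong characters.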
Sample HTML files are attached.
(The files cover several Japanese encoding patterns:
EUC, SJIS, and UTF-8, each with and without a charset META tag in the header.)
Output log (on Solaris, ja_JP.UTF-8):
sun.io.MalformedInputException
at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java:152)
at java.io.InputStreamReader.convertInto(InputStreamReader.java:137)
at java.io.InputStreamReader.fill(InputStreamReader.java:186)
at java.io.InputStreamReader.read(InputStreamReader.java:249)
at java.io.BufferedReader.fill(BufferedReader.java:139)
at java.io.BufferedReader.read(BufferedReader.java:157)
at java.io.StreamTokenizer.read(StreamTokenizer.java:472)
at java.io.StreamTokenizer.nextToken(StreamTokenizer.java:516)
at sun.plugin.converter.util.StdUtils.countWords(StdUtils.java:109)
at sun.plugin.converter.engine.PluginConverter.runConversion(PluginConverter.java:314)
at sun.plugin.converter.engine.PluginConverter.run(PluginConverter.java:250)
at java.lang.Thread.run(Thread.java:484)
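The trace suggests that the file is read through an InputStreamReader using the platform default encoding, which under ja_JP.UTF-8 would be UTF-8, so EUC or SJIS bytes are rejected as malformed. sun.io.ByteToCharUTF8 is internal to old JDKs; the sketch below reproduces the same kind of failure with the public java.nio.charset API of later Java versions. It is an illustration under that assumption, not the converter's code, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class StrictDecodeDemo {
    // "\u65E5\u672C\u8A9E" encoded as EUC-JP.
    static final byte[] EUC_JP_BYTES = {
        (byte) 0xC6, (byte) 0xFC, (byte) 0xCB, (byte) 0xDC, (byte) 0xB8, (byte) 0xEC
    };

    // Returns true if the bytes decode cleanly as UTF-8. A decoder obtained
    // from newDecoder() reports malformed input instead of replacing it.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // MalformedInputException is a subclass of CharacterCodingException.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u65E5\u672C\u8A9E".getBytes(StandardCharsets.UTF_8);
        System.out.println("EUC-JP bytes valid as UTF-8: " + isValidUtf8(EUC_JP_BYTES));
        System.out.println("UTF-8 bytes valid as UTF-8: " + isValidUtf8(utf8));
    }
}
```

The first EUC-JP byte, 0xC6, looks like the start of a two-byte UTF-8 sequence, but 0xFC is not a valid continuation byte, which is why a strict decoder stops immediately.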
Converter version used:
-rw-rw-r-- 1 on113181 staff 187237 Sep 8 06:24 htmlconv1-3.jar
Target environment:
Solaris 7 and 8, on both Intel and SPARC.
osamu.numayama@Japan 2000-09-08
---------------------------------------------------------------------------
HTML documents that specify their character encoding in a META tag are
converted properly, except on Solaris when the system locale is UTF-8 and
the document is not encoded in UTF-8.
                  | System  | HTML charset
Platform          | locale  | eucJP | shift_jis | UTF-8
------------------+---------+-------+-----------+-------
                  | euc     | OK    | OK        | OK
Solaris8-sparc    | pck     | OK    | OK        | OK
                  | utf-8   | NG*   | NG*       | OK
------------------+---------+-------+-----------+-------
                  | euc     | OK    | OK        | OK
Solaris7-IA       | pck     | OK    | OK        | OK
                  | utf-8   | NG*   | NG*       | OK
------------------+---------+-------+-----------+-------
Windows98         | sjis    | OK    | OK        | OK
------------------+---------+-------+-----------+-------
WindowsNT         | sjis    | OK    | OK        | OK
------------------+---------+-------+-----------+-------
Redhat Linux 6.2J | euc     | OK    | OK        | OK
------------------+---------+-------+-----------+-------
(*) The MalformedInputException shown above is thrown and the HTML files
are left unconverted.
kenichi.kurosaki@Japan 2000-11-09
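The report does not show how the converter was eventually fixed. One common way to avoid depending on the system locale is to scan the start of the file for a charset declaration and decode with that charset, falling back to a caller-supplied default. The sketch below illustrates that approach only; the class, the method names, the 4096-byte scan window, and the regular expression are all assumptions made for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaCharsetSniffer {
    // Matches charset=NAME inside a META tag, for example
    // <meta http-equiv="Content-Type" content="text/html; charset=EUC-JP">.
    private static final Pattern CHARSET = Pattern.compile(
        "(?i)<meta[^>]*charset\\s*=\\s*[\"']?([A-Za-z0-9][\\w.:+-]*)");

    // Returns the declared charset name, or the fallback if none is found.
    static String sniffCharsetName(byte[] html, String fallback) {
        // ISO-8859-1 maps every byte to a character, so this scan never
        // fails, and ASCII markup is readable in all encodings tested here.
        int window = Math.min(html.length, 4096);
        String head = new String(html, 0, window, StandardCharsets.ISO_8859_1);
        Matcher m = CHARSET.matcher(head);
        return m.find() ? m.group(1) : fallback;
    }

    // Decodes the whole document with the declared charset when the runtime
    // supports it; otherwise uses the fallback (assumed to be supported).
    static String decode(byte[] html, String fallback) {
        String name = sniffCharsetName(html, fallback);
        Charset cs = Charset.isSupported(name)
            ? Charset.forName(name)
            : Charset.forName(fallback);
        return new String(html, cs);
    }

    public static void main(String[] args) {
        String page = "<html><head><meta http-equiv=\"Content-Type\" "
            + "content=\"text/html; charset=utf-8\"></head>"
            + "<body>\u65E5\u672C\u8A9E</body></html>";
        byte[] bytes = page.getBytes(StandardCharsets.UTF_8);
        System.out.println("declared charset: " + sniffCharsetName(bytes, "ISO-8859-1"));
        System.out.println("round trip ok: " + decode(bytes, "ISO-8859-1").equals(page));
    }
}
```

Scanning as ISO-8859-1 works for EUC-JP, Shift_JIS, and UTF-8 because all three keep ASCII markup as single ASCII bytes; it would not work for encodings such as UTF-16.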
- backported by
  - JDK-2036864 HTML Converter file reading error and broken japanese string (Resolved)
  - JDK-2036863 HTML Converter file reading error and broken japanese string (Closed)