- Bug
- Resolution: Won't Fix
- P4
- None
- 1.4.1
- x86
- windows_2000
Name: jk109818 Date: 09/03/2002
FULL PRODUCT VERSION :
java version "1.4.1-rc"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1-rc-b19)
Java HotSpot(TM) Client VM (build 1.4.1-rc-b19, mixed mode)
FULL OPERATING SYSTEM VERSION : Windows 2000
ADDITIONAL OPERATING SYSTEMS : Probably all
A DESCRIPTION OF THE PROBLEM :
If the HTML Parser (HTMLEditorKit.Parser) is given a .doc
file to parse (by mistake - because it works on IE), it
takes forever to parse it and does not throw an exception.
I imagine a .doc file is sufficiently horrible to look at
for even the most obtuse parser to say that this is not an
HTML file and throw an exception.
Instead of that, it ran for 3516 seconds before finishing
without an error.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1. Run the program
2. Wait half an hour (400MHz PII)
3. Read the result
EXPECTED VERSUS ACTUAL BEHAVIOR :
I would expect the parser to throw an exception saying that
the format is unknown.
ERROR MESSAGES/STACK TRACES THAT OCCUR :
None - that's the problem
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
package monitor;

import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import javax.swing.text.html.HTMLDocument;
//import javax.swing.text.html.HTMLDocument.HTMLReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.HTMLEditorKit.Parser;
import javax.swing.text.html.HTMLEditorKit.ParserCallback;

public class HTMLParserBug {
    private InputStreamReader inputStreamReader;

    public HTMLParserBug() {
        // Fetch the remote .doc file and wrap it in a Reader, as if it were HTML.
        try {
            URL url = new URL("http://www.yellow-b.com/docs/jet/JET_description_30.doc");
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            InputStream httpInputStream = (InputStream) connection.getContent();
            inputStreamReader = new InputStreamReader(httpInputStream);
        } catch (Exception e) {
            throw new Error("Problems accessing file", e);
        }
        // Feed the binary content to the Swing HTML parser and time the call.
        try {
            HTMLEditorKit htmlEditorKit = new HTMLEditorKit();
            HTMLDocument htmlDocument = (HTMLDocument) htmlEditorKit.createDefaultDocument();
            Parser parser = htmlDocument.getParser();
            ParserCallback htmlReader = htmlDocument.getReader(0);
            long startTime = System.currentTimeMillis();
            parser.parse(inputStreamReader, htmlReader, true);
            long endTime = System.currentTimeMillis();
            System.out.println("Finished without error after "
                    + ((endTime - startTime) / 1000) + " seconds");
        } catch (Exception e) {
            throw new Error("Parser has signalled an error", e);
        }
    }

    public static void main(String[] args) {
        HTMLParserBug bug = new HTMLParserBug();
    }
}
---------- END SOURCE ----------
CUSTOMER WORKAROUND :
Write your own parser
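A lighter alternative to writing a whole parser may be to reject obviously non-HTML input before calling parser.parse(). The sketch below is only an illustration under stated assumptions: the class FormatGuard and its method names are hypothetical and are not part of Swing. It checks two things: whether a Content-Type header value (for example the result of URLConnection.getContentType()) names an HTML type, and whether the stream begins with the 8-byte signature of an OLE2 compound file, the container used by binary Word .doc files. It does not make the Swing parser itself any faster or stricter, and it only catches this one binary format.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;

// Hypothetical helper, not part of the JDK: pre-checks input before HTML parsing.
public class FormatGuard {
    // Signature at the start of every OLE2 compound file (binary .doc, .xls, .ppt).
    private static final int[] OLE2_MAGIC =
            {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    /** True if the stream starts with the OLE2 signature. The stream position is restored. */
    public static boolean looksLikeOle2(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(OLE2_MAGIC.length);
        try {
            for (int expected : OLE2_MAGIC) {
                if (in.read() != expected) {
                    return false; // mismatch or end of stream
                }
            }
            return true;
        } finally {
            in.reset(); // rewind so the caller can still read from the start
        }
    }

    /** True if a Content-Type header value names an HTML media type. */
    public static boolean isHtmlContentType(String contentType) {
        if (contentType == null) {
            return false;
        }
        String type = contentType.trim().toLowerCase(Locale.ROOT);
        return type.startsWith("text/html") || type.startsWith("application/xhtml+xml");
    }

    /** Buffers the stream and fails fast if it is clearly a .doc-style file. */
    public static InputStream requireNotOle2(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        if (looksLikeOle2(in)) {
            throw new IOException("Input looks like an OLE2 (.doc) file, not HTML");
        }
        return in;
    }

    public static void main(String[] args) throws IOException {
        byte[] doc = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                      (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1, 0, 0};
        byte[] html = "<html><body>hi</body></html>".getBytes("ISO-8859-1");
        System.out.println(looksLikeOle2(new ByteArrayInputStream(doc)));   // true
        System.out.println(looksLikeOle2(new ByteArrayInputStream(html)));  // false
        System.out.println(isHtmlContentType("text/html; charset=UTF-8"));  // true
        System.out.println(isHtmlContentType("application/msword"));        // false
    }
}
```

In the test program above, the stream returned by connection.getContent() could be passed through requireNotOle2() before it is wrapped in the InputStreamReader, so that a .doc file produces an IOException immediately instead of a long parse.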
(Review ID: 163473)
======================================================================