Bug
Resolution: Not an Issue
P4
None
1.4.2
x86
windows_xp
Name: gm110360 Date: 04/07/2003
FULL PRODUCT VERSION :
java version "1.4.2-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-beta-b19)
Java HotSpot(TM) Client VM (build 1.4.2-beta-b19, mixed mode)
FULL OS VERSION :
Microsoft Windows XP [Version 5.1.2600]
(Note: also shown error on win98 2nd edition)
A DESCRIPTION OF THE PROBLEM :
When a large file containing many entity references is parsed with SAX or DOM, the parser throws an exception: org.xml.sax.SAXException: Fatal Error: URI=null Line=595: Parser has reached the entity expansion limit "64,000" set by the Application.
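Note (not part of the original submission): the limit in the message counts entity expansions, not file size. A rough way to estimate how close a document comes to it is to count its general entity references. The sketch below does that with a regular expression. The class name EntityRefCount and the sample string are hypothetical, and it assumes that every reference, including predefined ones such as &lt;, counts once toward the limit. The stack trace (expandEntityInContent) suggests this for Crimson, but the report does not confirm it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the submitted source.
class EntityRefCount {
    // Matches general entity references such as &lt; or &quot;.
    // Character references (&#160; or &#xA0;) are deliberately not matched.
    private static final Pattern REF =
            Pattern.compile("&[A-Za-z_:][A-Za-z0-9_.:-]*;");

    /** Returns the number of general entity references found in the text. */
    static int count(String xml) {
        Matcher m = REF.matcher(xml);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Assumed sample: one escaped fragment standing in for a ChildNode body.
        String child = "&lt;b&gt;concinnity&lt;/b&gt;";
        int perChild = count(child);
        System.out.println(perChild + " references per ChildNode");
        System.out.println("64,000 would be reached after about "
                + (64000 / perChild) + " copies");
    }
}
```

Running count() over the real ChildNode text and dividing 64,000 by the result gives the approximate number of copies at which the parser would be expected to stop.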
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the source below against testfile.xml. Please email me for an example test file.
If you would rather not email me for the file, here is how to create one:
1) Create a file named testfile.xml in the directory where you run the code.
2) Paste in the following:
<?xml version='1.0' encoding='utf-8'?>
<!--DTD for vocab -->
<!DOCTYPE FirstNode [
<!ELEMENT FirstNode (ChildNode)*>
<!ELEMENT ChildNode (#PCDATA)>
]>
<FirstNode>
<ChildNode>
<html><body><a name="1"></a>
<p><b>concinnity</b></p>
<blockquote>concinnity was Word of the Day on <a href="http://www.dictionary.com/wordoftheday/archive/2001/08/18.html">August 18, 2001</a>.</blockquote><br>
<table border="0" cellpadding="0" cellspacing="0" width="100%"><tr><td class="src"><a href="/search?q=00-database-info&db=wotd" title="Click for more information about this dictionary">Source</a>: <cite>Dictionary.com Word of the Day</cite></td></tr></table>
<a name="2"></a>
<TABLE><TR><TD><A NAME="C0548200"><B>con·cin·ni·ty</B></A> <A TITLE="Click for guide to symbols." onClick="ahdpop();return false;" HREF="/help/ahd4/pronkey.html" CLASS="linksrc"><b>Pronunciation Key</b></A> (k<IMG ALT="" SRC="pronkey_files/schwa.gif" height="15" width="6" ALIGN="ABSBOTTOM">n-s<IMG
ALT="" SRC="pronkey_files/ibreve.gif" height="15" width="7" ALIGN="ABSBOTTOM">n<IMG ALT="" SRC="pronkey_files/prime.gif" height="22" width="4" ALIGN="ABSBOTTOM"><IMG ALT="" SRC="pronkey_files/ibreve.gif" height="15" width="7" ALIGN^F
quot; SRC="pronkey_files/emacr.gif" height="15" width="7" ALIGN="ABSBOTTOM">)<BR>
<I>n.</I> <I>pl.</I> <B>con·cin·ni·ties </B><OL><LI> Harmony in the arrangement or interarrangement of parts with respect to a whole.</LI>
<LI> Studied elegance and facility in style of expression: “He has what one character calls ‘the gifts of concinnity and concision,’ that deft swipe with a phrase that can be so
devastating in children” (Elizabeth Ward).
</LI>
<LI>An instance of harmonious arrangement or studied elegance and facility.</LI>
</OL><BR>
<HR ALIGN="left" WIDTH="25%">[From Latin<TT> concinnit<IMG ALT="" SRC="pronkey_files/amacr.gif" height="15" width="7" ALIGN="ABSBOTTOM">s</TT>, from<TT> concinn<IMG ALT="" SRC="pronkey_files/amacr.gif" height="15" width="7" ALIGN="ABSBOTTOM">re</TT>, <I>to put in order</I>,
from<TT> concinnus</TT>, <I>deftly joined</I>.]</TD>
</TR></TABLE>
<a name="3"></a>
<b>concinnity</b><br><br>
\Con*cin"ni*ty\, n. [L. concinnitas, fr. concinnus
skillfully put together, beautiful. Of uncertain origin.]
Internal harmony or fitness; mutual adaptation of parts;
elegance; -- used chiefly of style of discourse. [R.]
<br><br>
An exact concinnit
;<table border="0" cellpadding="0" cellspacing="0" width="100%"><tr><td class="src"><a href="/search?q=00-database-info&db=web1913" title="Click for more information about this dictionary">Source</a>: <cite>Webster's Revised Unabridged Dictionary, © 1996, 1998 MICRA, Inc.</cite></td></tr></table>
</body></html>
</ChildNode>
</FirstNode>
3) Copy and paste the <ChildNode>...</ChildNode> block repeatedly, about 196 times in total, inside <FirstNode>...</FirstNode>.
When you run the code, the error occurs after about 195 ChildNode elements have been read.
You can change lines 30 and 31 of the source:
test.DOMRead();
//test.SAXRead();
to:
//test.DOMRead();
test.SAXRead();
to test the SAX path. In both cases, an exception is thrown.
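Note (not part of the original submission): the file-creation steps above can also be scripted. The sketch below writes testfile.xml with the same DTD and 196 ChildNode copies. The class name MakeTestFile and the placeholder child text are hypothetical. It assumes the dictionary markup is stored in escaped form (&lt;, &gt;, &quot;) so that it is valid #PCDATA, which the report as shown does not confirm. It is written for a current JDK, not for 1.4.2.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical generator for illustration; not part of the submitted source.
class MakeTestFile {
    /** Builds the document: XML declaration, internal DTD, then `copies` ChildNode elements. */
    static String build(String childText, int copies) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version='1.0' encoding='utf-8'?>\n");
        sb.append("<!DOCTYPE FirstNode [\n");
        sb.append("<!ELEMENT FirstNode (ChildNode)*>\n");
        sb.append("<!ELEMENT ChildNode (#PCDATA)>\n");
        sb.append("]>\n");
        sb.append("<FirstNode>\n");
        for (int i = 0; i < copies; i++) {
            sb.append("<ChildNode>").append(childText).append("</ChildNode>\n");
        }
        sb.append("</FirstNode>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder body; substitute the escaped dictionary entry from step 2.
        String childText = "&lt;b&gt;concinnity&lt;/b&gt;";
        String xml = build(childText, 196);
        Files.write(Paths.get("testfile.xml"), xml.getBytes(StandardCharsets.UTF_8));
    }
}
```

With the short placeholder body the file stays far below the limit. The full escaped entry from step 2 is needed to reach roughly 64,000 references in 196 copies.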
EXPECTED VERSUS ACTUAL BEHAVIOR :
Expected: the file parses without error.
Actual: an exception is thrown at run time.
ERROR MESSAGES/STACK TRACES THAT OCCUR :
org.xml.sax.SAXException: Fatal Error: URI=null Line=595: Parser has reached the entity expansion limit "64,000" set by the Application.
at TErrorHandler.fatalError(XMLError.java:198)
at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3342)
at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3333)
at org.apache.crimson.parser.Parser2.expandEntityInContent(Parser2.java:2667)
at org.apache.crimson.parser.Parser2.maybeReferenceInContent(Parser2.java:2569)
at org.apache.crimson.parser.Parser2.content(Parser2.java:1980)
at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1654)
at org.apache.crimson.parser.Parser2.content(Parser2.java:1926)
at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1654)
at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:634)
at org.apache.crimson.parser.Parser2.parse(Parser2.java:333)
at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:448)
at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:185)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:76)
at XMLError.DOMRead(XMLError.java:101)
at XMLError.main(XMLError.java:30)
REPRODUCIBILITY :
This bug can always be reproduced.
---------- BEGIN SOURCE ----------
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;
public class XMLError {
    private String fname = null;

    public XMLError(String fname) {
        this.fname = fname;
    }

    public static void main(String[] argv) {
        XMLError test = new XMLError("testfile.xml");
        test.DOMRead();
        //test.SAXRead();
    }
    public void SAXRead() {
        System.out.println("Reading " + fname + "...");
        String data = readFile(fname);
        if (data == null) {
            System.out.println("There is no such file as " + fname);
            return;
        }
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            SAXParser parser = factory.newSAXParser();
            //org.xml.sax.helpers.DefaultHandler
            parser.parse(new ByteArrayInputStream(data.getBytes()), new DefaultHandler() {
                private CharArrayWriter contents = new CharArrayWriter();
                private int count;

                public void characters(char[] ch, int start, int length) {
                    contents.write(ch, start, length);
                }

                public void endDocument() {
                    System.out.println("Finish: " + count);
                }

                public void endElement(String uri, String localName, String qName) {
                    if (qName.equals("ChildNode")) {
                        count++;
                        String str = contents.toString();
                        System.out.println("Importing... " + count + " : " + str);
                    }
                }

                public void startDocument() {
                    //contents.reset();
                    count = 0;
                }

                public void startElement(String uri, String localName, String qName, Attributes attributes) {
                    contents.reset();
                    //System.out.println("The name: " + localName + ", qName: " + qName);
                }
            });
        } catch (Exception ee) {
            ee.printStackTrace();
        }
    }
    public void DOMRead() {
        System.out.println("Reading " + fname + "...");
        String data = readFile(fname);
        if (data == null) {
            System.out.println("There is no such file as " + fname);
            return;
        }
        int count = 0;
        try {
            TErrorHandler error = new TErrorHandler();
            DocumentBuilderFactory factory =
                    DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            factory.setIgnoringElementContentWhitespace(true);
            //factory.setNamespaceAware(true);
            //factory.setExpandEntityReferences(false);
            System.out.println("Parsing xml data...");
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(error);
            Document document = builder.parse(new ByteArrayInputStream(data.getBytes()));
            Node node;
            node = document.getFirstChild();
            if (node == null) {
                return;
            }
            System.out.println("Start importing data: ");
            while (node != null) {
                if (node.getNodeType() == Node.ELEMENT_NODE) {
                    if ("FirstNode".equalsIgnoreCase(node.getNodeName())) break;
                }
                node = node.getNextSibling();
            }
            node = node.getFirstChild();
            String str = null;
            boolean done = false;
            while ((node != null) && (!done)) {
                str = getValue(node);
                if (str == null) break;
                node = node.getNextSibling();
                count++;
                if ((count % 10) == 0) {
                    System.out.print(".");
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("\n\nDone: " + count);
    }
    static public String getValue(Node node) {
        if (node == null) return null;
        Node node2 = node.getFirstChild();
        if (node2 == null) {
            return "";
        }
        if (node2.getNodeType() != Node.TEXT_NODE) return null;
        return node2.getNodeValue();
    }
    public static String readFile(String fname) {
        if ((fname == null) || (fname.trim().length() <= 0)) {
            return null;
        }
        BufferedReader in = null;
        String str;
        StringBuffer buf = new StringBuffer();
        try {
            in = new BufferedReader(new FileReader(fname));
            // readLine() returns null at end of file, so ready() is not needed.
            while ((str = in.readLine()) != null) {
                buf.append(str + "\n");
            }
        } catch (IOException e) {
            //e.printStackTrace();
            return null;
        } finally {
            // Close the reader even when reading fails.
            if (in != null) {
                try {
                    in.close();
                } catch (IOException ignored) {
                }
            }
        }
        return buf.toString();
    }
}
class TErrorHandler implements ErrorHandler {
    int errNo = 0;
    String errMessage = "";

    public void resetError() {
        errNo = 0;
        errMessage = "";
    }

    public void setError(String mesg) {
        errNo = 1;
        if (mesg == null) return;
        errMessage = errMessage + "\n" + mesg;
    }

    TErrorHandler() {
    }

    private String getParseExceptionInfo(SAXParseException spe) {
        String systemId = spe.getSystemId();
        if (systemId == null) {
            systemId = "null";
        }
        String info = "URI=" + systemId + " Line=" + spe.getLineNumber() +
                ": " + spe.getMessage();
        return info;
    }

    public void warning(org.xml.sax.SAXParseException sAXParseException) throws org.xml.sax.SAXException {
        setError("Warning: " + getParseExceptionInfo(sAXParseException));
    }

    public void error(org.xml.sax.SAXParseException sAXParseException) throws org.xml.sax.SAXException {
        String message = "Error: " + getParseExceptionInfo(sAXParseException);
        throw new SAXException(message);
    }

    public void fatalError(org.xml.sax.SAXParseException sAXParseException) throws org.xml.sax.SAXException {
        String message = "Fatal Error: " + getParseExceptionInfo(sAXParseException);
        throw new SAXException(message);
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
None
(Review ID: 183616)
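Note (not part of the original submission): a possible workaround is to raise the limit through the entityExpansionLimit system property before the parser is created. The same setting can be passed on the command line as -DentityExpansionLimit=100000. The sketch below assumes the parser in use honors that property. The class name RaiseEntityLimit and the value 100000 are illustrative only, and raising the limit weakens the protection it provides against entity-expansion attacks on untrusted input.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper for illustration; not part of the submitted source.
class RaiseEntityLimit {
    /** Sets the limit property, then parses the given XML text into a DOM tree. */
    static Document parse(String xml, String limit) {
        // Assumption: the parser reads this system property; set it before
        // the factory and builder are created.
        System.setProperty("entityExpansionLimit", limit);
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers can stay simple.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<FirstNode><ChildNode>a &lt; b</ChildNode></FirstNode>";
        Document doc = parse(xml, "100000");
        System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
    }
}
```

Passing the explicit UTF-8 bytes also avoids the platform-default encoding used by data.getBytes() in the submitted source, which may differ from the utf-8 declared in the test file.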
======================================================================
###@###.### 2004-07-13