FULL PRODUCT VERSION :
java version "1.6.0-ea"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.6.0-ea-b47)
Java HotSpot(TM) Client VM (build 1.6.0-ea-b47, mixed mode, sharing)
A DESCRIPTION OF THE PROBLEM :
As of Mustang b46, if the last character in a CDATA section is ']', the CDATA
end delimiter (CDEnd) is not recognized. More precisely, the CDEnd has to be
immediately preceded by an odd number of right square brackets to trigger the
bug. I tracked the problem down to this section of the scanData method in
class com.sun.org.apache.xerces.internal.impl.XMLEntityScanner:
    // iterate over buffer looking for delimiter
    OUTER: while (fCurrentEntity.position < fCurrentEntity.count) {
        c = fCurrentEntity.ch[fCurrentEntity.position++];
        if (c == charAt0) {
            // looks like we just hit the delimiter
            int delimOffset = fCurrentEntity.position - 1;
            for (int i = 1; i < delimLen; i++) {
                if (fCurrentEntity.position == fCurrentEntity.count) {
                    fCurrentEntity.position -= i;
                    break OUTER;
                }
                c = fCurrentEntity.ch[fCurrentEntity.position++];
                if (delimiter.charAt(i) != c) {
                    fCurrentEntity.position--; // S/B position -= i;
                    break;
                }
            }
            if (fCurrentEntity.position == delimOffset + delimLen) {
                found = true;
                break;
            }
        }
Within the for loop, the parse position can be advanced any number of places
before a non-match is detected by the second if statement. When that happens,
the position should be backed off by the value of the loop counter, as is
done in the first if statement. Instead it is always backed off by exactly
one place, which can leave the parse position out of sync with the data.
Before b46, that never happened because the method was only used to find
two-character delimiters: "--", "?>", and "]]". With a two-character
delimiter the loop counter is always 1 at the point of a non-match, so both
back-offs give the same result. But now the scanCDATASection method in class
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl
passes it the full CDEnd sequence, "]]>".
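The effect of the two back-off amounts can be seen in isolation with a small
stand-alone model of the loop. This is only a sketch, not the Xerces code:
the class DelimiterScanSketch, the method indexOfDelimiter and the flag
backOffByLoopCounter are hypothetical names, the buffer is a plain char
array, and running out of buffer is treated as "not found" instead of
triggering a reload.

```java
final class DelimiterScanSketch {

    /**
     * Simplified model of the delimiter search (hypothetical, not Xerces).
     * Returns the index where the delimiter starts, or -1 if it is not found.
     * backOffByLoopCounter == false models the reported behaviour (position--);
     * backOffByLoopCounter == true models the suggested change (position -= i).
     */
    static int indexOfDelimiter(char[] ch, String delimiter,
                                boolean backOffByLoopCounter) {
        int position = 0;
        final int count = ch.length;
        final int delimLen = delimiter.length();
        final char charAt0 = delimiter.charAt(0);

        while (position < count) {
            char c = ch[position++];
            if (c == charAt0) {
                // possible start of the delimiter
                int delimOffset = position - 1;
                for (int i = 1; i < delimLen; i++) {
                    if (position == count) {
                        // assumption: no buffer reload in this sketch
                        return -1;
                    }
                    c = ch[position++];
                    if (delimiter.charAt(i) != c) {
                        // resume just after delimOffset (i) or one place back (1)
                        position -= backOffByLoopCounter ? i : 1;
                        break;
                    }
                }
                if (position == delimOffset + delimLen) {
                    return delimOffset;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] oddBrackets = "blah]]]>".toCharArray();
        // back-off by one: the "]]>" starting at index 5 is skipped
        System.out.println(indexOfDelimiter(oddBrackets, "]]>", false)); // -1
        // back-off by the loop counter: the delimiter is found
        System.out.println(indexOfDelimiter(oddBrackets, "]]>", true));  // 5
    }
}
```

With an even number of extra brackets ("blah]]]]>") both variants of this
model locate the delimiter at index 6, which matches the observation that
only an odd number of preceding brackets triggers the problem.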
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the sample code against the supplied XML file.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
<test>
<test01>blah]</test01>
<test02>blah</test02>
</test>
ACTUAL -
<test>
<test01>blah]</test01>
<test02>blah
ERROR MESSAGES/STACK TRACES THAT OCCUR :
org.xml.sax.SAXParseException: The element type "test01" must be terminated by the matching end-tag "</test01>".
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:388)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1419)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1763)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2944)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:664)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:524)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:844)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:774)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1255)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:376)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:312)
at Test.main(Test.java:21)
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
==== Test.java ===============================================================
import java.io.*;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;

public class Test extends DefaultHandler
{
    public static void main(String[] args)
    {
        DefaultHandler handler = new Test();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try
        {
            out = new OutputStreamWriter(System.out, "UTF8");
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(new File("test.xml"), handler);
        }
        catch (Throwable t)
        {
            System.out.println();
            System.out.println();
            t.printStackTrace();
        }
        System.exit(0);
    }

    private static Writer out;

    //===========================================================
    // SAX DocumentHandler methods
    //===========================================================

    public void endDocument() throws SAXException
    {
        try
        {
            nl();
            out.flush();
        }
        catch (IOException e)
        {
            throw new SAXException("I/O error", e);
        }
    }

    public void startElement(String namespaceURI, String lName,
                             String qName, Attributes attrs)
        throws SAXException
    {
        emit("<" + qName + ">");
    }

    public void endElement(String namespaceURI, String sName,
                           String qName)
        throws SAXException
    {
        emit("</" + qName + ">");
    }

    public void characters(char buf[], int offset, int len)
        throws SAXException
    {
        String s = new String(buf, offset, len);
        emit(s);
    }

    private void emit(String s) throws SAXException
    {
        try
        {
            out.write(s);
            out.flush();
        }
        catch (IOException e)
        {
            throw new SAXException("I/O error", e);
        }
    }

    private void nl() throws SAXException
    {
        String lineEnd = System.getProperty("line.separator");
        try
        {
            out.write(lineEnd);
        }
        catch (IOException e)
        {
            throw new SAXException("I/O error", e);
        }
    }
}
==== test.xml ===============================================================
<?xml version='1.0' encoding='utf-8'?>
<test>
<test01><![CDATA[blah]]]></test01>
<test02><![CDATA[blah]]></test02>
</test>
---------- END SOURCE ----------
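For a quicker check that does not need a separate test.xml, the same
document can be parsed from a string. The following is a minimal sketch
under that assumption; the class name CdataBracketCheck and the method echo
are hypothetical and not part of the original test case. It uses only the
standard JAXP SAX API and echoes element tags and character data in the
same way as Test.java.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class CdataBracketCheck {

    /** Parses xml with the default SAX parser and echoes tags and text. */
    static String echo(String xml) {
        final StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String lName,
                                     String qName, Attributes attrs) {
                out.append('<').append(qName).append('>');
            }

            @Override
            public void endElement(String uri, String lName, String qName) {
                out.append("</").append(qName).append('>');
            }

            @Override
            public void characters(char[] buf, int offset, int len) {
                // CDATA content may arrive in several chunks; append them all
                out.append(buf, offset, len);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // on an affected build a SAXParseException is expected here
            throw new IllegalStateException("parse failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<test>"
                + "<test01><![CDATA[blah]]]></test01>"
                + "<test02><![CDATA[blah]]></test02>"
                + "</test>";
        System.out.println(echo(xml));
    }
}
```

On a parser without the defect this should print the two elements with
"blah]" and "blah" as their content; on an affected build the parse is
expected to fail with the SAXParseException shown above.");
if (!got.equals("<test><test01>blah]</test01><test02>blah</test02></test>")) throw new AssertionError(got);
String even = CdataBracketCheck.echo("<a><![CDATA[x]]]]></a>");
if (!even.equals("<a>x]]</a>")) throw new AssertionError(even);
</test>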
Release Regression From : 5.0
The above release value was the last known release where this
bug was known to work. Since then there has been a regression.
Release Regression From : tiger-rc
The above release value was the last known release where this
bug was known to work. Since then there has been a regression.
Release Regression From : dolphin
The above release value was the last known release where this
bug was known to work. Since then there has been a regression.