JDK-6313289

REGRESSION: CDATA end delimiters are not parsed correctly

    • 6.0
    • x86
    • windows_xp

      FULL PRODUCT VERSION :
      java version "1.6.0-ea"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.6.0-ea-b47)
      Java HotSpot(TM) Client VM (build 1.6.0-ea-b47, mixed mode, sharing)

      A DESCRIPTION OF THE PROBLEM :
      As of Mustang b46, if the last character in a CDATA section is ']', the CDATA
      end delimiter (CDEnd) is not recognized. More precisely, the CDEnd has to be
      immediately preceded by an odd number of right square brackets to trigger the
      bug. I tracked the problem down to this section of the scanData method in
      class com.sun.org.apache.xerces.internal.impl.XMLEntityScanner:

              // iterate over buffer looking for delimiter
              OUTER: while (fCurrentEntity.position < fCurrentEntity.count) {
                  c = fCurrentEntity.ch[fCurrentEntity.position++];
                  if (c == charAt0) {
                      // looks like we just hit the delimiter
                      int delimOffset = fCurrentEntity.position - 1;
                      for (int i = 1; i < delimLen; i++) {
                          if (fCurrentEntity.position == fCurrentEntity.count) {
                              fCurrentEntity.position -= i;
                              break OUTER;
                          }
                          c = fCurrentEntity.ch[fCurrentEntity.position++];
                          if (delimiter.charAt(i) != c) {
                              fCurrentEntity.position--; // should be: position -= i;
                              break;
                          }
                      }
                      if (fCurrentEntity.position == delimOffset + delimLen) {
                          found = true;
                          break;
                      }
                  }

      Within the for loop, the parse position can be advanced any number of
      places before a non-match is detected by the second if statement. When that
      happens, the position should be backed off by the value of the loop counter,
      as is done in the first if statement. Instead it is always backed off by
      one place, which can leave the parse position out of sync with the data.
      Before b46, that never happened because the method was only used to find
      two-character delimiters: "--", "?>", and "]]". With a two-character
      delimiter the loop counter is always 1 when the mismatch is detected, so
      backing off by one place is equivalent. But now the scanCDATASection
      method in class
      com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl is
      passing it the full CDEnd sequence, "]]>".
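
      The effect of the two back-off amounts can be illustrated outside the
      parser. The following is a minimal sketch, not the Xerces code: the class
      DelimiterScanSketch, the method find, and the backOffByLoopCounter flag are
      hypothetical names introduced here, and the buffer is a plain char array
      with no refill logic. It searches for a delimiter using the same loop shape
      as the excerpt above, backing off either by one place (the reported
      behavior) or by the loop counter (the suggested correction).

```java
// Illustrative sketch only; names are hypothetical and not part of the JDK.
final class DelimiterScanSketch {

    /**
     * Returns the index at which delimiter starts in buf, or -1 if it is not found.
     * If backOffByLoopCounter is false, a mismatch backs the position off by one
     * place (the reported behavior); if true, it backs off by the loop counter.
     */
    static int find(char[] buf, String delimiter, boolean backOffByLoopCounter) {
        int delimLen = delimiter.length();
        char charAt0 = delimiter.charAt(0);
        int position = 0;
        while (position < buf.length) {
            char c = buf[position++];
            if (c == charAt0) {
                // looks like we just hit the delimiter
                int delimOffset = position - 1;
                boolean matched = true;
                for (int i = 1; i < delimLen; i++) {
                    if (position == buf.length) {
                        return -1; // out of data; the real scanner would refill its buffer
                    }
                    c = buf[position++];
                    if (delimiter.charAt(i) != c) {
                        // resume either one place back or just after delimOffset
                        position -= backOffByLoopCounter ? i : 1;
                        matched = false;
                        break;
                    }
                }
                if (matched) {
                    return delimOffset;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] data = "blah]]]>".toCharArray();
        System.out.println("back off by 1: " + find(data, "]]>", false));
        System.out.println("back off by i: " + find(data, "]]>", true));
    }
}
```

      For the input "blah]]]>", backing off by one place skips the second
      bracket, so the search restarts on the third bracket and never sees the
      full "]]>" sequence. Backing off by the loop counter restarts the search on
      the second bracket and finds the delimiter at index 5.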


      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Run the sample code against the supplied XML file.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      <test>

          <test01>blah]</test01>

          <test02>blah</test02>

      </test>
      ACTUAL -
      <test>

          <test01>blah]</test01>

          <test02>blah

      ERROR MESSAGES/STACK TRACES THAT OCCUR :
      org.xml.sax.SAXParseException: The element type "test01" must be terminated by the matching end-tag "</test01>".
              at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
              at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215)
              at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:388)
              at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1419)

              at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1763)
              at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2944)
              at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:664)
              at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:524)
              at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:844)
              at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:774)
              at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
              at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1255)
              at javax.xml.parsers.SAXParser.parse(SAXParser.java:376)
              at javax.xml.parsers.SAXParser.parse(SAXParser.java:312)
              at Test.main(Test.java:21)

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      ==== Test.java ===============================================================

      import java.io.*;

      import javax.xml.parsers.SAXParser;
      import javax.xml.parsers.SAXParserFactory;

      import org.xml.sax.*;
      import org.xml.sax.helpers.DefaultHandler;

      public class Test extends DefaultHandler
      {
        public static void main(String[] args)
        {
          DefaultHandler handler = new Test();
          SAXParserFactory factory = SAXParserFactory.newInstance();
          try
          {
            out = new OutputStreamWriter(System.out, "UTF8");

            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(new File("test.xml"), handler);
          }
          catch (Throwable t)
          {
            System.out.println();
            System.out.println();
            t.printStackTrace();
          }
          System.exit(0);
        }

        private static Writer out;

        //===========================================================
        // SAX DocumentHandler methods
        //===========================================================

        public void endDocument() throws SAXException
        {
          try
          {
            nl();
            out.flush();
          }
          catch (IOException e)
          {
            throw new SAXException("I/O error", e);
          }
        }

        public void startElement(String namespaceURI, String lName,
                                 String qName, Attributes attrs)
             throws SAXException
        {
          emit("<" + qName + ">");
        }

        public void endElement(String namespaceURI, String sName,
                               String qName)
             throws SAXException
        {
          emit("</" + qName + ">");
        }

        public void characters(char buf[], int offset, int len)
             throws SAXException
        {
          String s = new String(buf, offset, len);
          emit(s);
        }

        private void emit(String s) throws SAXException
        {
          try
          {
            out.write(s);
            out.flush();
          }
          catch (IOException e)
          {
            throw new SAXException("I/O error", e);
          }
        }

        private void nl() throws SAXException
        {
          String lineEnd = System.getProperty("line.separator");
          try
          {
            out.write(lineEnd);
          }
          catch (IOException e)
          {
            throw new SAXException("I/O error", e);
          }
        }
      }


      ==== test.xml ===============================================================

      <?xml version='1.0' encoding='utf-8'?>

      <test>

          <test01><![CDATA[blah]]]></test01>

          <test02><![CDATA[blah]]></test02>

      </test>

      ---------- END SOURCE ----------
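
      For a check that does not depend on a test.xml file on disk, the same
      input can be parsed from a string. This is a sketch under assumptions: the
      class CdataEndCheck and the method text are hypothetical names, and only
      character data is collected rather than the full element output of
      Test.java above. On a build without the defect, the call in main should
      yield the text "blah]"; on an affected build a SAXParseException like the
      one shown earlier would be expected instead.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative sketch only; class and method names are hypothetical.
final class CdataEndCheck {

    /** Parses the given XML text and returns all character data, concatenated. */
    static String text(String xml) throws Exception {
        final StringBuilder sb = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] buf, int offset, int len) {
                sb.append(buf, offset, len);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // The CDATA content ends in ']', so CDEnd is preceded by one bracket.
        System.out.println(text("<test01><![CDATA[blah]]]></test01>"));
    }
}
```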

      Release Regression From : 5.0
      The above release value was the last known release where this
      bug was known to work. Since then there has been a regression.

      Release Regression From : tiger-rc
      The above release value was the last known release where this
      bug was known to work. Since then there has been a regression.

      Release Regression From : dolphin
      The above release value was the last known release where this
      bug was known to work. Since then there has been a regression.

            sreddysunw Sunitha Reddy (Inactive)
            tbell Tim Bell