JDK-8029955

AIOB in XMLEntityScanner.scanLiteral upon parsing literals with > 100 LF chars

    • 6
    • b122
    • Verified

        FULL PRODUCT VERSION :
        Applicable to all versions I've tried, including the latest tip from Mercurial.

        ADDITIONAL OS VERSION INFORMATION :
        Applicable to all.

        A DESCRIPTION OF THE PROBLEM :
        XMLEntityScanner uses a hardcoded, fixed-size buffer when parsing literals:

        int [] whiteSpaceLookup = new int[100];

        Later, there is a loop that doesn't check whether this buffer is overrun:

                    for ( i = offset; i < fCurrentEntity.position; i++) {
                        fCurrentEntity.ch[i] = '\n';
                        whiteSpaceLookup[whiteSpaceLen++]=i;
                    }

        This leads to easy-to-hit exceptions from the parser (the XML is odd, true, but valid). Code sample that reproduces the problem:

        import java.io.StringReader;

        import org.xml.sax.InputSource;
        import org.xml.sax.XMLReader;
        import org.xml.sax.helpers.XMLReaderFactory;

        public class Foo {
          public static void main(String[] args) throws Exception {
            StringBuilder builder = new StringBuilder();
            builder.append("<root attr=\"");
            for (int i = 0; i < 200; i++) {
              builder.append("\n");
            }
            builder.append("foo.");
            builder.append("\" />");
            final XMLReader reader = XMLReaderFactory.createXMLReader();
            System.out.println(reader.getClass().getName());
            reader.parse(new InputSource(new StringReader(builder.toString())));
          }
        }


        ADDITIONAL REGRESSION INFORMATION:
        This is a long-standing issue. It was reported to us back in 2011, but at the time I didn't inspect it closely (shame on me).

        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        Parse an XML document with an attribute containing more than 100 LF chars.


        EXPECTED VERSUS ACTUAL BEHAVIOR :
        EXPECTED -
        Should parse.
        ACTUAL -
        Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 100
        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanLiteral(XMLEntityScanner.java:1145)
        at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:948)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:436)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:253)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDriver.scanRootElementHook(XMLNSDocumentScannerImpl.java:602)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3116)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:880)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:116)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:846)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:775)
        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:123)
        at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1210)


        REPRODUCIBILITY :
        This bug can be reproduced always.

        ---------- BEGIN SOURCE ----------
        import java.io.StringReader;

        import org.xml.sax.InputSource;
        import org.xml.sax.XMLReader;
        import org.xml.sax.helpers.XMLReaderFactory;

        public class Foo {
          public static void main(String[] args) throws Exception {
            StringBuilder builder = new StringBuilder();
            builder.append("<root attr=\"");
            for (int i = 0; i < 200; i++) {
              builder.append("\n");
            }
            builder.append("foo.");
            builder.append("\" />");
            final XMLReader reader = XMLReaderFactory.createXMLReader();
            System.out.println(reader.getClass().getName());
            reader.parse(new InputSource(new StringReader(builder.toString())));
          }
        }

        ---------- END SOURCE ----------
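
A variant of the test case that also checks the parsed value may be useful for a regression test. This is only a sketch: the class name AttrLfCheck and the helper parseAttr are made up for illustration, and it uses SAXParserFactory with a DefaultHandler rather than XMLReaderFactory. It assumes a JDK where the overrun is fixed; on an affected JDK the parse call fails with the ArrayIndexOutOfBoundsException shown above. Under XML attribute-value normalization each LF in the literal should be reported as a space, so the value should be 200 spaces followed by "foo.".

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class AttrLfCheck {

  // Hypothetical helper: builds <root attr="..."/> with lfCount LF chars
  // before "foo." and returns the attribute value reported by the parser.
  static String parseAttr(int lfCount) {
    StringBuilder builder = new StringBuilder("<root attr=\"");
    for (int i = 0; i < lfCount; i++) {
      builder.append('\n');
    }
    builder.append("foo.\" />");

    final String[] value = new String[1];
    DefaultHandler handler = new DefaultHandler() {
      @Override
      public void startElement(String uri, String localName, String qName, Attributes atts) {
        value[0] = atts.getValue("attr");
      }
    };

    try {
      SAXParserFactory.newInstance().newSAXParser()
          .parse(new InputSource(new StringReader(builder.toString())), handler);
    } catch (Exception e) {
      // Wrap checked exceptions so callers do not need a throws clause.
      throw new RuntimeException(e);
    }
    return value[0];
  }

  public static void main(String[] args) {
    String value = parseAttr(200);
    System.out.println("attr length: " + value.length());
    System.out.println("ends with foo.: " + value.endsWith("foo."));
  }
}
```

Counts at or below 100 exercise the path that already worked, so a regression test would probably want one value on each side of that boundary.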

        CUSTOMER SUBMITTED WORKAROUND :
        Rewrite the fixed-size buffer routine.
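
One possible shape for such a rewrite is to grow the lookup array on demand instead of writing past its end. The following is a minimal standalone sketch, not the actual XMLEntityScanner code or the fix that was applied: the class WhitespaceLookupSketch and the methods recordWhiteSpace and normalizeAndRecord are hypothetical names, and only the field names whiteSpaceLookup and whiteSpaceLen are taken from the snippet in the description.

```java
import java.util.Arrays;

public class WhitespaceLookupSketch {

  // Same initial capacity as the hardcoded buffer in the report.
  private int[] whiteSpaceLookup = new int[100];
  private int whiteSpaceLen = 0;

  // Hypothetical helper: records a position, doubling the buffer when it is
  // full instead of overrunning it.
  void recordWhiteSpace(int position) {
    if (whiteSpaceLen >= whiteSpaceLookup.length) {
      whiteSpaceLookup = Arrays.copyOf(whiteSpaceLookup, whiteSpaceLookup.length * 2);
    }
    whiteSpaceLookup[whiteSpaceLen++] = position;
  }

  // Mimics the loop from the report: overwrite each char in [offset, end)
  // with LF and record its index. Returns the number of recorded positions.
  static int normalizeAndRecord(char[] ch, int offset, int end) {
    WhitespaceLookupSketch sketch = new WhitespaceLookupSketch();
    for (int i = offset; i < end; i++) {
      ch[i] = '\n';
      sketch.recordWhiteSpace(i);
    }
    return sketch.whiteSpaceLen;
  }

  public static void main(String[] args) {
    char[] ch = new char[200];
    // 200 positions are recorded without an ArrayIndexOutOfBoundsException.
    System.out.println(normalizeAndRecord(ch, 0, ch.length));
  }
}
```

Doubling keeps the amortized cost of recording a position constant; a fixed increment would also work for the sizes involved here.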

              joehw Joe Wang
              asaha Abhijit Saha