JDK-8288549

Regression inside Xerces XMLEntityManager


      ADDITIONAL SYSTEM INFORMATION :
      Windows 10 - OpenJDK 64-Bit Server VM 11.0.15

      A DESCRIPTION OF THE PROBLEM :
      In the commit "https://github.com/openjdk/jdk/commit/7ee905a8a09c92b9534a440660d37c28cf5d797b", the following line

      expandedSystemId = expandSystemId(extLitSysId, extBaseSysId);

      was changed to

      expandedSystemId = expandSystemId(extLitSysId, extBaseSysId, fStrictURI);

      This change added checks that ultimately lead to a "com.sun.org.apache.xerces.internal.util.URI$MalformedURIException" when the underlying DTD is loaded from a file path containing special characters, such as German umlauts.
      This can easily happen if the application stores its data (e.g. its DTD files) in the user's app data folder and the user's Windows login name contains a special character.
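
      To illustrate the kind of path involved, here is a minimal sketch using only java.net.URI. The class name SystemIdDemo, the helper names, and the example path are hypothetical and not taken from the affected application. It shows that such a path contains a non-ASCII character and how toASCIIString() percent-encodes it. Whether passing an encoded system ID avoids the exception on an affected JDK build has not been verified here.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SystemIdDemo {

    // Builds a file: URI from an absolute path and returns its ASCII form.
    // toASCIIString() percent-encodes non-ASCII characters as UTF-8 octets.
    static String toAsciiSystemId(String absolutePath) throws URISyntaxException {
        URI uri = new URI("file", null, absolutePath, null);
        return uri.toASCIIString();
    }

    // Returns true if every character of the string is in the US-ASCII range.
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c < 128);
    }

    public static void main(String[] args) throws URISyntaxException {
        // Hypothetical app data path for a user whose login name contains an umlaut.
        String path = "/C:/Users/J\u00F6rg/AppData/Roaming/MyApp/schema.dtd";
        System.out.println("ASCII only: " + isAscii(path));
        System.out.println("Encoded:    " + toAsciiSystemId(path));
    }
}
```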
         
      Here is one of the additional checks:

              // Assume the URIs are well-formed. If it turns out they're not, try fixing them up.
              try {
                   return expandSystemIdStrictOff(systemId, baseSystemId);
              }
              catch (URI.MalformedURIException e) {
                  /** Xerces URI rejects unicode, try java.net.URI
                   * this is not ideal solution, but it covers known cases which either
                   * Xerces URI or java.net.URI can handle alone
                   * will file bug against java.net.URI
                   */
                  try {
                      return expandSystemIdStrictOff1(systemId, baseSystemId);
                  } catch (URISyntaxException ex) {
                      // continue on...
                  }
              }

      If the baseSystemId has special characters in its path, both the call to expandSystemIdStrictOff and the call to expandSystemIdStrictOff1 throw a URI.MalformedURIException.
      The inner catch block, however, handles only URISyntaxException, so the URI.MalformedURIException thrown by the second call is not caught and propagates to the caller.
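
      The control flow described above can be reproduced in isolation. The following is a minimal sketch, not the JDK code: CatchScopeDemo, MalformedUriException, firstAttempt, secondAttempt and expand are hypothetical stand-ins for the Xerces classes and methods. It only demonstrates that an exception thrown inside a catch block escapes when the nested try catches a different type.

```java
import java.io.IOException;
import java.net.URISyntaxException;

public class CatchScopeDemo {

    // Hypothetical stand-in for the Xerces URI.MalformedURIException.
    static class MalformedUriException extends IOException {
        MalformedUriException(String message) {
            super(message);
        }
    }

    // Stand-in for the first expansion attempt; always rejects the input here.
    static String firstAttempt(String id) throws MalformedUriException {
        throw new MalformedUriException("first attempt rejected: " + id);
    }

    // Stand-in for the fallback attempt; it may fail with either exception type,
    // and in this sketch it fails with the one the nested catch does not handle.
    static String secondAttempt(String id)
            throws MalformedUriException, URISyntaxException {
        throw new MalformedUriException("second attempt rejected: " + id);
    }

    // Same try/catch shape as the snippet quoted in the report.
    static String expand(String id) throws MalformedUriException {
        try {
            return firstAttempt(id);
        } catch (MalformedUriException e) {
            try {
                return secondAttempt(id);
            } catch (URISyntaxException ex) {
                // continue on...
            }
        }
        return id; // only reached if the fallback fails with URISyntaxException
    }

    public static void main(String[] args) {
        try {
            System.out.println(expand("file:/C:/Users/J\u00F6rg/schema.dtd"));
        } catch (MalformedUriException e) {
            // The exception from secondAttempt reaches the caller.
            System.out.println("Escaped to caller: " + e.getMessage());
        }
    }
}
```

      In this sketch the caller always sees the exception from the second attempt, because the outer catch clause does not cover exceptions thrown from within its own body.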

      Previously, a special character in a path did not raise any exception and was handled correctly throughout the Xerces pipeline.

      Here is the complete stacktrace:
      com.sun.org.apache.xerces.internal.util.URI$MalformedURIException: Opaque part contains invalid character: ö
      at com.sun.org.apache.xerces.internal.util.URI.initializePath(URI.java:1143) ~[?:?]
      at com.sun.org.apache.xerces.internal.util.URI.initialize(URI.java:583) ~[?:?]
      at com.sun.org.apache.xerces.internal.util.URI.<init>(URI.java:336) ~[?:?]
      at com.sun.org.apache.xerces.internal.util.URI.<init>(URI.java:299) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.expandSystemIdStrictOff1(XMLEntityManager.java:2393) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.expandSystemId(XMLEntityManager.java:2239) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1237) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.startPE(XMLDTDScannerImpl.java:732) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.skipSeparator(XMLDTDScannerImpl.java:2101) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.scanDecls(XMLDTDScannerImpl.java:2064) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.scanDTDExternalSubset(XMLDTDScannerImpl.java:299) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1165) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1040) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:943) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112) ~[?:?]
      at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:534) ~[?:?]
      at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888) ~[?:?]
      at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824) ~[?:?]
      at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) ~[?:?]
      at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1216) ~[?:?]
      at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635) ~[?:?]

      REGRESSION : Last worked in version 13


      FREQUENCY : always


            joehw Joe Wang
            webbuggrp Webbug Group