JDK-8306056

Add a built-in Catalog to JDK XML module

    • Type: CSR
    • Resolution: Approved
    • Priority: P4
    • 22
    • xml
    • None
    • minimal
    • The risk is minimal. There's no change to the existing API and behavior of the JDK.
    • Java API, System or security property
    • SE

      Summary

      Add a JDK built-in Catalog that hosts DTDs defined by the Java Platform. A CatalogResolver created with the built-in Catalog functions as the JDK's default resolver for external resource references. Add a system property for specifying the action the default resolver should take when it is unable to resolve a resource.

      Problem

      XML documents commonly contain references to external resources such as DTDs (Document Type Definitions) and XSDs (XML Schema Definitions). When the JAXP processors encounter such references, their default behavior is to try to make a network connection to retrieve the external documents, which can place a heavy load on the hosting server, as has been observed in the past. Processing can also be disrupted when the hosting servers become unstable or unreachable (e.g. discontinued servers).

      A more serious concern with reading external resources is security, as XML External Entity (XXE) injection attacks have been one of the major sources of vulnerabilities. The JAXP library provides measures such as the External Access Properties (ACCESS_EXTERNAL_DTD, ACCESS_EXTERNAL_SCHEMA, and ACCESS_EXTERNAL_STYLESHEET) for applications to control whether external references can be accessed. These restrictions, however, can be impractical because they regulate access by protocol (e.g. file, http): once a restriction is set, all references using that protocol are rejected.
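
      For illustration, the External Access Properties mentioned above can be set on a DOM factory roughly as follows. This is a minimal sketch: the class and method names (ExternalAccessSketch, restrictedFactory, parse) are hypothetical and not part of this CSR; only the XMLConstants properties and the JAXP factory calls come from the existing API.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
final class ExternalAccessSketch {

    /** Creates a DOM factory that permits external DTD and schema access over the listed protocols only. */
    static DocumentBuilderFactory restrictedFactory(String allowedProtocols) {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // An empty string denies every protocol; "file" would allow file: URLs
        // but still reject every http: reference, whatever its target.
        dbf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, allowedProtocols);
        dbf.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, allowedProtocols);
        return dbf;
    }

    /** Parses an in-memory document, wrapping checked exceptions for brevity. */
    static Document parse(DocumentBuilderFactory dbf, String xml) {
        try {
            DocumentBuilder builder = dbf.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(restrictedFactory(""), "<root/>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

      The restriction applies per protocol rather than per resource, which is the coarseness the paragraph above describes.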

      Solution

      The proposed solution aims to provide a mechanism for users to regulate how external references are processed.

      The solution has two parts:

      a. A built-in catalog that hosts the DTDs defined by the Java Platform
      
      b. A resolve property, jdk.xml.jdkcatalog.resolve, for determining the action the built-in CatalogResolver may take when unable to resolve a resource.

      In this solution, the JDK creates a CatalogResolver based on the built-in catalog when needed. This CatalogResolver functions as the default external resource resolver. When no user-defined resolver is registered on a JDK XML processor, the processor falls back to the default CatalogResolver to attempt to resolve an external reference before making a connection to fetch it. The fall-back also takes place if a user-defined resolver exists but allows the process to continue when it is unable to resolve the resource.
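
      As a sketch of a user-defined resolver that "allows the process to continue", the SAX EntityResolver below returns null for any system ID it does not know. The class name LocalFirstResolver and its in-memory map of local copies are hypothetical, chosen only for this example; under this proposal, a null return is the point at which a JDK XML processor would consult the default CatalogResolver.

```java
import java.io.StringReader;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical user-defined resolver, for illustration only.
final class LocalFirstResolver implements EntityResolver {

    // Maps a system ID to the text of a locally held copy (an assumption of this sketch).
    private final Map<String, String> localCopies;

    LocalFirstResolver(Map<String, String> localCopies) {
        this.localCopies = localCopies;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String text = (systemId == null) ? null : localCopies.get(systemId);
        if (text == null) {
            // Not resolved here: returning null lets the processor continue
            // with its own handling of the reference.
            return null;
        }
        InputSource source = new InputSource(new StringReader(text));
        source.setSystemId(systemId);
        return source;
    }
}
```

      Such a resolver would be registered with DocumentBuilder.setEntityResolver or XMLReader.setEntityResolver.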

      If the default CatalogResolver is unable to locate a resource, it will signal the XML processors to continue processing, or skip the resource, or throw a CatalogException. The action it takes is configured with the jdk.xml.jdkcatalog.resolve property.
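
      The three actions correspond to the documented behavior of the existing javax.xml.catalog.resolve feature for user-created resolvers. The sketch below uses that existing feature as an analogy, not the new property: it builds a CatalogResolver over an empty catalog (no catalog files are supplied, and it assumes the javax.xml.catalog.files property is not set) and reports what happens for an unmatched system ID. The class name ResolveActionSketch and the returned labels are hypothetical.

```java
import javax.xml.catalog.CatalogException;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import org.xml.sax.InputSource;

// Hypothetical demo class, for illustration only.
final class ResolveActionSketch {

    /** Reports how a resolver over an empty catalog reacts to a system ID it cannot match. */
    static String outcome(String resolveValue) {
        CatalogFeatures features = CatalogFeatures.builder()
                .with(CatalogFeatures.Feature.RESOLVE, resolveValue)
                .build();
        // No catalog URIs are passed, so nothing can match.
        CatalogResolver resolver = CatalogManager.catalogResolver(features);
        try {
            InputSource source = resolver.resolveEntity(null, "http://example.invalid/unknown.dtd");
            // null tells the processor to continue; an empty source makes it skip the reference.
            return (source == null) ? "returned null" : "returned empty source";
        } catch (CatalogException e) {
            return "threw CatalogException";
        }
    }

    public static void main(String[] args) {
        for (String value : new String[] {"continue", "ignore", "strict"}) {
            System.out.println(value + ": " + outcome(value));
        }
    }
}
```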

      Specification

      Add an implementation-specific property

      Property Name: jdk.xml.jdkcatalog.resolve

      System Property: jdk.xml.jdkcatalog.resolve

      Description: instructs the JDK default CatalogResolver to act in accordance with the setting of this property when unable to resolve an external reference with the built-in Catalog. The options are:

      continue -- Indicates that the processing should continue
      
      ignore -- Indicates that the reference is skipped
      
      strict -- Indicates that the resolver should throw a CatalogException
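
      A minimal sketch of selecting one of these options through the system property follows. The class and method names (JdkCatalogResolveSketch, parseStrict) are hypothetical. The sketch sets the property before creating the factory on the assumption that processors read it at creation time; on JDK releases that predate this change the property is expected to have no effect.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, for illustration only.
final class JdkCatalogResolveSketch {

    static final String RESOLVE_PROPERTY = "jdk.xml.jdkcatalog.resolve";

    /** Parses an in-memory document after requesting the strict action. */
    static Document parseStrict(String xml) {
        // Assumption: setting the property before the factory is created
        // ensures processors created afterwards observe it.
        System.setProperty(RESOLVE_PROPERTY, "strict");
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseStrict("<root/>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

      The same value can be supplied on the command line with -Djdk.xml.jdkcatalog.resolve=strict.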

      This change is documented in the module-summary. A new section, "JDK built-in Catalog", and a new property, jdk.xml.jdkcatalog.resolve, are added. Below is the diff:

      + * <h2 id="JDKCATALOG">JDK built-in Catalog</h2>
      + * The JDK has a built-in catalog that hosts the following DTDs defined by the Java Platform:
      + * <ul>
      + * <li>DTD for {@link java.util.prefs.Preferences java.util.prefs.Preferences}, preferences.dtd</li>
      + * <li>DTD for {@link java.util.Properties java.util.Properties}, properties.dtd</li>
      + * </ul>
      + * <p>
      + * The catalog is loaded once when the first JAXP processor factory is created.
      + *
      + * <h3 id="JC_PROCESS">External Resource Resolution Process with the built-in Catalog</h3>
      + * The JDK creates a {@link javax.xml.catalog.CatalogResolver CatalogResolver}
      + * with the built-in catalog when needed. This CatalogResolver is used as the
      + * default external resource resolver.
      + * <p>
      + * XML processors may use resolvers (such as {@link org.xml.sax.EntityResolver EntityResolver},
      + * {@link javax.xml.stream.XMLResolver XMLResolver}, and {@link javax.xml.catalog.CatalogResolver CatalogResolver})
      + * to handle external references. In the absence of the user-defined resolvers,
      + * the JDK XML processors fall back to the default CatalogResolver to attempt to
      + * find a resolution before making a connection to fetch the resources. The fall-back
      + * also takes place if a user-defined resolver exists but allows the process to
      + * continue when unable to resolve the resource.
      + * <p>
      + * If the default CatalogResolver is unable to locate a resource, it may signal
      + * the XML processors to continue processing, or skip the resource, or
      + * throw a CatalogException. The behavior is configured with the
      + * <a href="#JDKCATALOG_RESOLVE">{@code jdk.xml.jdkcatalog.resolve}</a> property.
      + *
      + * <tr>
      + * <td id="JDKCATALOG_RESOLVE">{@systemProperty jdk.xml.jdkcatalog.resolve}</td>
      + * <td>Instructs the JDK default CatalogResolver to act in accordance with the setting
      + * of this property when unable to resolve an external reference with the built-in Catalog.
      + * The options are:
      + * <ul>
      + * <li><p>
      + * {@code continue} -- Indicates that the processing should continue
      + * </li>
      + * <li><p>
      + * {@code ignore} -- Indicates that the reference is skipped
      + * </li>
      + * <li><p>
      + * {@code strict} -- Indicates that the resolver should throw a CatalogException
      + * </li>
      + * </ul>
      + * </td>
      + * <td style="text-align:center">String</td>
      + * <td>
      + * {@code continue, ignore, and strict}. Values are case-insensitive.
      + * </td>
      + * <td style="text-align:center">continue</td>
      + * <td style="text-align:center">No</td>
      + * <td style="text-align:center">Yes</td>
      + * <td style="text-align:center">
      + *     <a href="#DOM">DOM</a><br>
      + *     <a href="#SAX">SAX</a><br>
      + *     <a href="#StAX">StAX</a><br>
      + *     <a href="#Validation">Validation</a><br>
      + *     <a href="#Transform">Transform</a>
      + * </td>
      + * <td style="text-align:center"><a href="#Processor">Method 1</a></td>
      + * <td style="text-align:center">22</td>
      + * </tr>
      

            joehw Joe Wang
            joehw Joe Wang
            Alan Bateman, Lance Andersen