JDK-6760982

JVM 1.6 Xerces Parser Corrupts Attribute Value

    • Type: Bug
    • Resolution: Duplicate
    • Priority: P4
    • None
    • 6
    • xml

      FULL PRODUCT VERSION :
      [root@localhost Download]# /usr/local/jdk1.6.0_06/jre/bin/java -version
      java version "1.6.0_06"
      Java(TM) SE Runtime Environment (build 1.6.0_06-b02)
      Java HotSpot(TM) Client VM (build 10.0-b22, mixed mode, sharing)
      [root@localhost Download]# /usr/local/jdk1.6.0_10/jre/bin/java -version
      java version "1.6.0_10-beta"
      Java(TM) SE Runtime Environment (build 1.6.0_10-beta-b14)
      Java HotSpot(TM) Client VM (build 11.0-b11, mixed mode, sharing)


      ADDITIONAL OS VERSION INFORMATION :
      Linux localhost.localdomain 2.6.23.8-63.fc8 #1 SMP Wed Nov 21 18:51:08 EST 2007 i686 i686 i386 GNU/Linux


      A DESCRIPTION OF THE PROBLEM :
      Problem
      -------
      Sun's JVM 1.6 (6u6-linux and 6u10-beta-linux) and its associated
      Xerces parser corrupt an attribute value of the enclosed XML
      document, PrintXML.xml, once node.getChildNodes () is invoked (see
      the attached code, PrintXML.java, compiled with '-source 1.4' for
      testing convenience). Instead of producing

       Test (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
           mytest (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
               <attribs> (class com.sun.org.apache.xerces.internal.dom.AttributeMap):
                   Y = []
                   Z = ZZ[]
                   a = []
                   b = []
                   c = []
                   d = []
                   e = []
                   f = []

      it produces

       Test (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
           mytest (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
               <attribs> (class com.sun.org.apache.xerces.internal.dom.AttributeMap):
                   Y = ZZ <-- CORRUPTED VALUE
                   Z = ZZ[]
                   a = []
                   b = []
                   c = []
                   d = []
                   e = []
                   f = []

      This corruption does _not_ occur with Sun's JVM 1.4 or 1.5, or when
      Xerces >= 1.4.4 is explicitly included. (I surmise that Sun's JVM 1.6
      introduces a bug that causes a latent Xerces < 1.4.4 bug to become
      manifest; however, I don't know what version of Xerces Sun
      distributes.)
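
      Since the bundled Xerces version is unknown, one way to at least see
      which JAXP implementation a given JVM picks up is to print the concrete
      factory and builder classes. The following is a minimal sketch, not part
      of the original report; the class name ParserInfo is hypothetical, and
      the manifest version it prints may simply be null for the bundled parser.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper (not part of the original report).
public class ParserInfo {

    // Returns a one-line description of the JAXP implementation in use.
    static String describe ()
    {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance ();
            DocumentBuilder db = dbf.newDocumentBuilder ();
            Package pkg = db.getClass ().getPackage ();

            // Read from the jar manifest, if any; may be null.
            String version = (pkg == null) ? null : pkg.getImplementationVersion ();

            return "factory=" + dbf.getClass ().getName ()
                 + ", builder=" + db.getClass ().getName ()
                 + ", version=" + version;
        }
        catch (Exception e) {
            throw new IllegalStateException (e.toString ());
        }
    }

    public static void main (String[] args)
    {
        System.out.println (describe ());
    }
}
```

      Running this once with and once without a standalone Xerces jar on the
      class path shows whether the explicitly included parser is really the
      one being used.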


      Test Case Variations
      ---- ---- ----------
      The number of attributes matters but their names do not; if attributes
      are added or removed (I've only varied the count by +/- 1 and 2)
      before 'Y', the bug is not manifested. However, adding variations on
      the 'Y'/'Z' attribute couplet _after_ the 'Z' attribute will
      (repeatedly) manifest the bug.

      The attribute values must contain '[' and ']', but it does not matter
      what or how many characters precede or follow these characters. If any
      attribute prior to 'Y' does not contain both a left and a right bracket,
      the bug is not manifested.

      The order of the attributes matters -- if 'Z' precedes 'Y', the bug is
      not manifested.
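
      The variations above can be explored without editing the XML file by
      hand. The sketch below is an illustration only: the class
      AttributeVariations and its methods are hypothetical, the attribute
      names a0, a1, ... are made up, and the generated document drops the
      whitespace and line breaks of PrintXML.xml, which may or may not matter
      for triggering the bug. It generates a 'mytest' element with a varying
      number of bracketed attributes before 'Y' and 'Z', issues the same
      getAttributes ()/getChildNodes () calls as PrintXML.java, and reports
      the value of 'Y'.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical harness (not part of the original report).
public class AttributeVariations {

    // Builds <Test><mytest .../></Test> with 'leading' bracketed attributes before Y and Z.
    static String buildXml (int leading)
    {
        StringBuffer sb = new StringBuffer ("<Test><mytest");

        for (int i = 0; i < leading; i++)
            sb.append (" a").append (i).append ("='[]'");

        sb.append (" Y='[]' Z='ZZ[]'/></Test>");
        return sb.toString ();
    }

    // Parses the XML, touches the nodes as PrintXML.java does, and returns the value of Y.
    static String valueOfY (String xml)
    {
        try {
            Document doc = DocumentBuilderFactory.newInstance ()
                .newDocumentBuilder ()
                .parse (new InputSource (new StringReader (xml)));
            Element root = doc.getDocumentElement ();

            root.getAttributes ();
            root.getChildNodes ();

            // No whitespace is generated, so the first child is <mytest>.
            Element mytest = (Element) root.getFirstChild ();

            mytest.getAttributes ();
            mytest.getChildNodes ();

            return mytest.getAttribute ("Y");
        }
        catch (Exception e) {
            throw new IllegalStateException (e.toString ());
        }
    }

    public static void main (String[] args)
    {
        for (int n = 4; n <= 8; n++) {
            String y = valueOfY (buildXml (n));

            System.out.println (n + " leading attributes: Y = " + y
                + ("[]".equals (y) ? "" : " <-- CORRUPTED VALUE"));
        }
    }
}
```

      A parser that behaves correctly returns '[]' for every variant; any
      other value marks a count at which the corruption appears.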


      User Workaround
      ---- ----------
      Distribute Xerces >= 1.4.4 jar files with the application.
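
      When a standalone Xerces jar is shipped, the application can also ask
      for its factory by name instead of relying on class path discovery. The
      sketch below is illustrative, not the reporter's code: the class
      ParserSelection is hypothetical, and the factory class name is an
      assumption that should be checked against the Xerces jar actually
      distributed. It uses DocumentBuilderFactory.newInstance (String,
      ClassLoader), which is available from Java 6 on, and falls back to the
      default factory when the named class is not found.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;

// Hypothetical helper (not part of the original report).
public class ParserSelection {

    // Assumed name of the Apache Xerces JAXP factory; verify against the shipped jar.
    static final String XERCES_FACTORY =
        "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl";

    static DocumentBuilderFactory newFactory ()
    {
        try {
            return DocumentBuilderFactory.newInstance (XERCES_FACTORY,
                ParserSelection.class.getClassLoader ());
        }
        catch (FactoryConfigurationError e) {
            // Standalone Xerces is not on the class path: use the JVM default.
            return DocumentBuilderFactory.newInstance ();
        }
    }

    public static void main (String[] args)
    {
        System.out.println ("Using " + newFactory ().getClass ().getName ());
    }
}
```

      The printed class name shows which of the two factories was selected.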


      JVM Workaround
      --- ----------
      Update XML parser to a more recent version of Xerces and fix
      underlying JVM bug.


      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Compile PrintXML.java (see below) with '-source 1.4':

          javac -source 1.4 PrintXML.java

      Run with

           java -cp . PrintXML PrintXML.xml


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
       Test (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
           mytest (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
               <attribs> (class com.sun.org.apache.xerces.internal.dom.AttributeMap):
                   Y = []
                   Z = ZZ[]
                   a = []
                   b = []
                   c = []
                   d = []
                   e = []
                   f = []

      ACTUAL -
       Test (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
           mytest (class com.sun.org.apache.xerces.internal.dom.DeferredElementImpl):
               <attribs> (class com.sun.org.apache.xerces.internal.dom.AttributeMap):
                   Y = ZZ
                   Z = ZZ[]
                   a = []
                   b = []
                   c = []
                   d = []
                   e = []
                   f = []


      ERROR MESSAGES/STACK TRACES THAT OCCUR :
      No error message; the artifact is a corrupted XML attribute value.
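
      Because nothing fails visibly, a small diagnostic can help narrow the
      problem down. The output above shows deferred DOM classes
      (DeferredElementImpl), so one experiment is to parse the same document
      with Xerces' deferred node expansion switched on and off and compare the
      value of 'Y'. This is a sketch under assumptions: the class
      DeferredCheck is hypothetical, the feature identifier is the Apache
      Xerces 'defer-node-expansion' feature, and whether disabling it avoids
      this particular corruption has not been verified.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical diagnostic (not part of the original report).
public class DeferredCheck {

    // Apache Xerces feature controlling deferred DOM node expansion.
    static final String DEFER =
        "http://apache.org/xml/features/dom/defer-node-expansion";

    // Same attributes as PrintXML.xml, without its whitespace and line breaks.
    static final String XML =
        "<Test><mytest a='[]' b='[]' c='[]' d='[]' e='[]' f='[]'"
        + " Y='[]' Z='ZZ[]'/></Test>";

    // Parses XML with the feature set as requested and returns the value of Y.
    static String valueOfY (boolean deferred)
    {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance ();

            // Throws ParserConfigurationException if the parser rejects the feature.
            dbf.setFeature (DEFER, deferred);

            Document doc = dbf.newDocumentBuilder ()
                .parse (new InputSource (new StringReader (XML)));
            Node mytest = doc.getDocumentElement ().getFirstChild ();

            // The calls after which the report observes the bad value.
            mytest.getAttributes ();
            mytest.getChildNodes ();

            return mytest.getAttributes ().getNamedItem ("Y").getNodeValue ();
        }
        catch (Exception e) {
            throw new IllegalStateException (e.toString ());
        }
    }

    public static void main (String[] args)
    {
        System.out.println ("deferred:     Y = " + valueOfY (true));
        System.out.println ("non-deferred: Y = " + valueOfY (false));
    }
}
```

      If the two lines differ on an affected JVM, the corruption is tied to
      the deferred DOM; if both show the bad value, it lies elsewhere.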


      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      // PrintXML.java

      import java.io.File;
      import java.io.FileReader;
      import java.io.Reader;

      import javax.xml.parsers.DocumentBuilder;
      import javax.xml.parsers.DocumentBuilderFactory;

      import org.w3c.dom.Document;
      import org.w3c.dom.NamedNodeMap;
      import org.w3c.dom.Node;
      import org.w3c.dom.NodeList;
      import org.xml.sax.InputSource;


      public class PrintXML {

          private PrintXML () {
          }

          private static void _Flush ()
          {
              System.out.flush ();
              System.err.flush ();
          }

          private static void _Println (String str, int level)
          {
              /*...Indent four spaces per nesting level...*/
              for (int i = 0; i < level; i++)
                  System.out.print ("    ");

              System.out.println (str);
              System.out.flush ();
          }

          private static void _ErrPrintln (String aStr)
          {
              System.out.flush ();
              System.err.println (aStr);
              System.err.flush ();
          }

          private static Document _Parse (File f)
              throws Exception
          {
              FileReader rd = new FileReader (f);
              Document doc = _Parse (rd);

              rd.close ();

              return doc;
          }

          private static Document _Parse (Reader src)
              throws Exception
          {
              DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance ();

              dbf.setValidating (false); // to improve performance

              DocumentBuilder xmlParser = dbf.newDocumentBuilder ();
              InputSource is = new InputSource (src);

              return xmlParser.parse (is);
          }

          private static void _PrintAttributes (Node n, int level)
          {
              NamedNodeMap nnmap = n.getAttributes ();

              if (nnmap != null && nnmap.getLength () > 0) {
                  _Println ("<attribs> (" + nnmap.getClass () + "):", level + 1);

                  for (int i = 0; i < nnmap.getLength (); i++) {
                      Node an = nnmap.item (i);

                      String nameStr = an.getNodeName ();
                      String valueStr = an.getNodeValue ();

                      /*...Append the value only when it is non-empty...*/
                      if (valueStr != null && valueStr.length () > 0)
                          nameStr += " = " + valueStr;

                      _Println (nameStr, level + 2);
                  }
              }
          }

          private static void _ProcessChildren (Node n, int level)
              throws Exception
          {
              NodeList nlist = n.getChildNodes ();

              if (nlist != null)
                  for (int i = 0; i < nlist.getLength (); i++)
                      _ProcessNode (nlist.item (i), level + 1);
          }

          private static void _ProcessNode (Node n, int level)
              throws Exception
          {
              n.getAttributes ();
              n.getChildNodes ();

              // At this point, for JVM 1.6 and Xerces <= 1.3.1, PrintXML.xml::mytest:Y's attribute value is (already) bad.

              switch (n.getNodeType ()) {

                  case Node.TEXT_NODE:
                      String str = n.getNodeValue ().trim ();

                      /*...Only print non-empty strings...*/
                      if (str.length () > 0) {
                          String valStr = n.getNodeValue ();

                          _Println (valStr, level);
                      }
                      break;

                  case Node.COMMENT_NODE:
                      break;

                  default: {
                      String nodeNameStr = n.getNodeName ();

                      _Println (nodeNameStr + " (" + n.getClass () + "):", level);

                      /*...Print children...*/
                      _ProcessChildren (n, level);

                      /*...Print optional node attributes...*/
                      _PrintAttributes (n, level);
                  }
              }
          }

          /**
           * @param args command-line arguments; the first one is the XML file to parse
           */
          public static void main (String[] args) {

              String xmlFile = null;

              /*...Process CLI arguments...*/
              for (int i = 0; i < args.length; i++) {
                  String argStr = args[i].trim ();

                  if (xmlFile == null)
                      xmlFile = argStr;
                  else
                      _ErrPrintln ("Unknown argument: " + argStr);
              }

              if (xmlFile == null) {
                  _ErrPrintln ("Error: missing <xml file>");
              }
              else {
                  try {
                      Document xmlDoc = _Parse (new File (xmlFile));
                      Node node = xmlDoc.getDocumentElement ();

                      _ProcessNode (node, 0);
                      _Flush ();
                  }
                  catch (Exception e) {
                      _ErrPrintln ("Exception: " + e.toString ());
                      e.printStackTrace ();
                  }
              }

              _Flush ();
          }

      }


      ////////////////////////// Test XML file: PrintXML.xml ///////////////////////////

      <?xml version="1.0" encoding="UTF-8"?>

      <Test>
        <mytest a= '[]'
                 b= '[]'
                 c= '[]'
                 d= '[]'
                 e= '[]'
                 f= '[]'
                 Y= '[]'
                 Z= 'ZZ[]'
        />
      </Test>

      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Distribute Xerces >= 1.4.4 jar files with the application, or use a Sun JVM < 1.6.

      (Note that this bug does not give me confidence in JVM 1.6's reliability.)

            Assignee: Unassigned
            Reporter: Roger Yeung (ryeung, Inactive)