JDK-6690015

XML Parse: attributes with &gt; in attribute value cause wrong order


    • Bug
    • Resolution: Duplicate
    • P3
    • None
    • 6
    • xml

      FULL PRODUCT VERSION :
      :~$ java -version
      java version "1.6.0_03"
      Java(TM) SE Runtime Environment (build 1.6.0_03-b05)
      Java HotSpot(TM) Client VM (build 1.6.0_03-b05, mixed mode, sharing)

      ADDITIONAL OS VERSION INFORMATION :
      Windows XP service pack 2
      Linux <hostname> 2.6.22-14-generic #1 SMP Tue Feb 12 07:42:25 UTC 2008 i686 GNU/Linux

      A DESCRIPTION OF THE PROBLEM :
      The problem depends on at least two factors:

      1. The number of attributes in the parsed element
      2. The presence of allowed entity references in attribute values, e.g. &gt;

      A search of the bug database turned up a similar (but not identical) bug, 6567432, but that one was declared fixed in Java 6 update 3, and I am using Java 6 update 5.
      ===================================================

      Problem:

      When an XML element is parsed, and that element has:
          1. enough attributes (my tests used 16 attributes)
          2. attributes whose values contain allowed entity references, e.g. &gt;
      the retrieval of attributes results in:
          1. mixed-up attribute name / attribute value pairs
          2. sometimes attribute values merged with attribute names, producing generally confused output
          3. no exception or error of any kind; wrong output is the only symptom.

      This bug does NOT occur in Java 1.4.2.
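
      The two conditions above can also be checked without an external file. The following is a minimal sketch, not part of the original report: it builds a synthetic element in memory with a chosen number of attributes, each containing &gt;, parses it, and compares every attribute value with the value that was written. The class name AttributeEntityCheck, the attribute names a0, a1, ... and the value text are assumptions made for illustration, and it has not been verified that this synthetic input triggers the failure on Java 6 update 5. It uses Java 7+ syntax, so it is not suitable for the Java 1.4 comparison run.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not from the original report.
public class AttributeEntityCheck {

    // Builds <block><text a0="value (&gt;= 1.5 mm) a0" a1="..." .../></block>.
    static String buildXml(int attributeCount) {
        StringBuilder sb = new StringBuilder(
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><block><text");
        for (int i = 0; i < attributeCount; i++) {
            String name = "a" + i;
            sb.append(' ').append(name)
              .append("=\"value (&gt;= 1.5 mm) ").append(name).append('"');
        }
        return sb.append("/></block>").toString();
    }

    // Parses the synthetic document and returns one entry per attribute
    // whose parsed value differs from the value that was written.
    static List<String> findMismatches(int attributeCount) throws Exception {
        byte[] xml = buildXml(attributeCount).getBytes(StandardCharsets.UTF_8);
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        Element text = (Element) doc.getDocumentElement()
                .getElementsByTagName("text").item(0);

        List<String> mismatches = new ArrayList<>();
        for (int i = 0; i < attributeCount; i++) {
            String name = "a" + i;
            // The parser is expected to expand &gt; to a literal '>'.
            String expected = "value (>= 1.5 mm) " + name;
            String actual = text.getAttribute(name);
            if (!expected.equals(actual)) {
                mismatches.add(name + " -> " + actual);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) throws Exception {
        List<String> mismatches = findMismatches(16);
        System.out.println(mismatches.isEmpty()
                ? "all attributes match"
                : "mismatches: " + mismatches);
    }
}
```

      Looking attributes up by name rather than by index keeps the check independent of the order in which the parser stores them.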


      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Save the XML below to a file and change the filename in the example to point to it. Then compile and run the test application against that file, first with Java 1.4 and then with Java 6, and compare the results.
      Java 1.4 produces correct output;
      Java 6 produces garbage.

      package astraia.test;

      import java.io.FileInputStream;

      import javax.xml.parsers.DocumentBuilderFactory;

      import org.w3c.dom.Document;
      import org.w3c.dom.Element;
      import org.w3c.dom.NamedNodeMap;
      import org.w3c.dom.Node;
      import org.w3c.dom.NodeList;
      import org.xml.sax.InputSource;

      public class Example
      {
          public static void main(String[] argv)
          {
              try
              {
                  // Change this path to point to the saved XML file.
                  FileInputStream fis = new FileInputStream("/home/sean/Desktop/chris/lessNoInternat.xml");
                  try
                  {
                      Document doc = DocumentBuilderFactory.newInstance()
                              .newDocumentBuilder()
                              .parse(new InputSource(fis));
                      Element root = doc.getDocumentElement();
                      NodeList textnodes = root.getElementsByTagName("text");

                      // Print every attribute of every <text> element.
                      for (int index = 0; index < textnodes.getLength(); index++)
                      {
                          Element te = (Element) textnodes.item(index);
                          NamedNodeMap attrs = te.getAttributes();

                          for (int attindex = 0; attindex < attrs.getLength(); attindex++)
                          {
                              Node node = attrs.item(attindex);
                              System.out.println("attr: " + node.getNodeName() + " is shown holding value: " + node.getNodeValue());
                          }
                          System.out.println("-------------");
                      }
                  }
                  finally
                  {
                      // Close the stream even if parsing fails.
                      fis.close();
                  }
              }
              catch (Exception e)
              {
                  System.out.println("we've had an exception, type " + e);
              }
          }
      }

      xml file:

      <?xml version="1.0" encoding="UTF-8"?>
      <block>
      <lang>
      <text dna="8233" ro="hello, and i'll type some normal characters in (&gt;=1.5 mm) ro" it="here to make sure international characters don't play a part(&gt;=1.5mm) it" tr="make sure international characters don't play a part (&gt;=1.5 mm) tr" pt_br="make sure international characters don't play a part (&gt;=1,5 mm) pt_br" de="make sure international characters don't play a part (&gt;=1,5 mm) de" el="make sure international characters don't play a part (&gt;= 1.5 mm) el" zh_cn="make sure international characters don't play a part¿&gt;= 1.5 mm¿ zh_cn" pt="make sure international characters don't play a part (&gt;=1,5 mm) pt" bg="make sure international characters don't play a part (&gt;= 1.5 mm) bg" fr="make sure international characters don't play a part (&gt;= 1,5 mm) fr" en="make sure international characters don't play a part (&gt;= 1.5 mm) en" ru="make sure international characters don't play a part (&gt;=1.5 ¿¿) ru" es="make sure international characters don't play a part (&gt;=1.5 mm) es" ja="make sure international characters don't play a part¿&gt;=1.5mm¿ ja" nl="make sure international characters don't play a part (&gt;= 1,5 mm) nl" />
      </lang>
      </block>

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -

      When I iterate through the attributes and print their names and values, the output should match what I see in the XML file.
      Below is a run of the application using Java 1.4.
      Each line shows the attribute name on the left, followed by the value the parser reports for it:

      attr: <attribute-name> is shown holding value: <attribute-value>


      attr: dna is shown holding value: 8233
      attr: ro is shown holding value: hello, and i'll type some normal characters in (>=1.5 mm) ro
      attr: it is shown holding value: here to make sure international characters don't play a part(>=1.5mm) it
      attr: tr is shown holding value: make sure international characters don't play a part (>=1.5 mm) tr
      attr: pt_br is shown holding value: make sure international characters don't play a part (>=1,5 mm) pt_br
      attr: de is shown holding value: make sure international characters don't play a part (>=1,5 mm) de
      attr: el is shown holding value: make sure international characters don't play a part (>= 1.5 mm) el
      attr: zh_cn is shown holding value: make sure international characters don't play a part¿>= 1.5 mm¿ zh_cn
      attr: pt is shown holding value: make sure international characters don't play a part (>=1,5 mm) pt
      attr: bg is shown holding value: make sure international characters don't play a part (>= 1.5 mm) bg
      attr: fr is shown holding value: make sure international characters don't play a part (>= 1,5 mm) fr
      attr: en is shown holding value: make sure international characters don't play a part (>= 1.5 mm) en
      attr: ru is shown holding value: make sure international characters don't play a part (>=1.5 ¿¿) ru
      attr: es is shown holding value: make sure international characters don't play a part (>=1.5 mm) es
      attr: ja is shown holding value: make sure international characters don't play a part¿>=1.5mm¿ ja
      attr: nl is shown holding value: make sure international characters don't play a part (>= 1,5 mm) nl
      -------------

      ACTUAL -
      When the example program is run with Java 6 update 5, attribute names and values are sometimes garbled together and mixed up. For example, the value of attribute 'en' no longer matches the original content; instead it holds the value of another attribute with the name of yet another attribute appended at the end.

      The lines use the same format as above:

      attr: <attribute-name> is shown holding value: <attribute-value>


      attr: bg is shown holding value: make sure international characters don't play a part (>= 1,5 mm) fr
      attr: de is shown holding value: make sure international characters don't play a part (>=1,5 mm) de
      attr: dna is shown holding value: 8233
      attr: el is shown holding value: make sure international characters don't play a part (>= 1.5 mm) el
      attr: en is shown holding value: make sure international characters don't play a part (>=1.5 ¿¿) run
      attr: es is shown holding value: make sure international characters don't play a part¿>=1.5mm¿ jaes
      attr: fr is shown holding value: make sure international characters don't play a part (>= 1,5 mm) fr
      attr: it is shown holding value: here to make sure international characters don't play a part(>=1.5mm) it
      attr: ja is shown holding value: make sure international characters don't play a part¿>=1.5mm¿ ja
      attr: nl is shown holding value: make sure international characters don't play a part (>= 1,5 mm) nl
      attr: pt is shown holding value: make sure international characters don't play a part (>=1,5 mm) pt
      attr: pt_br is shown holding value: make sure international characters don't play a part (>=1,5 mm) pt_br
      attr: ro is shown holding value: hello, and i'll type some normal characters in (>=1.5 mm) ro
      attr: ru is shown holding value: make sure international characters don't play a part (>=1.5 ¿¿) ru
      attr: tr is shown holding value: make sure international characters don't play a part (>=1.5 mm) tr
      attr: zh_cn is shown holding value: make sure international characters don't play a part (>=1,5 mm) pt_cn
      -------------
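
      The two listings come out in different orders, so they are easier to compare by attribute name than line by line. The order itself is not the defect: the DOM specification states that a NamedNodeMap is not maintained in any particular order. Below is a minimal sketch of such a comparison, not part of the original report. It parses lines in the "attr: ... is shown holding value: ..." format printed by the example and lists the attribute names whose values differ. The class and method names (AttrOutputDiff, parse, differingNames) are hypothetical, and the sample lines in main are copied from the listings above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the original report.
public class AttrOutputDiff {
    private static final String PREFIX = "attr: ";
    private static final String SEPARATOR = " is shown holding value: ";

    // Turns printed lines into a name -> value map; other lines are skipped.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.startsWith(PREFIX)) {
                continue;
            }
            int sep = trimmed.indexOf(SEPARATOR, PREFIX.length());
            if (sep < 0) {
                continue;
            }
            String name = trimmed.substring(PREFIX.length(), sep);
            String value = trimmed.substring(sep + SEPARATOR.length());
            result.put(name, value);
        }
        return result;
    }

    // Returns the attribute names whose value in 'actual' is missing
    // or different from the value in 'expected', in expected order.
    static List<String> differingNames(List<String> expected, List<String> actual) {
        Map<String, String> want = parse(expected);
        Map<String, String> got = parse(actual);
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, String> entry : want.entrySet()) {
            if (!entry.getValue().equals(got.get(entry.getKey()))) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList(
            "attr: dna is shown holding value: 8233",
            "attr: bg is shown holding value: make sure international characters don't play a part (>= 1.5 mm) bg",
            "attr: fr is shown holding value: make sure international characters don't play a part (>= 1,5 mm) fr");
        List<String> actual = Arrays.asList(
            "attr: bg is shown holding value: make sure international characters don't play a part (>= 1,5 mm) fr",
            "attr: dna is shown holding value: 8233",
            "attr: fr is shown holding value: make sure international characters don't play a part (>= 1,5 mm) fr");
        // prints: values differ for: [bg]
        System.out.println("values differ for: " + differingNames(expected, actual));
    }
}
```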


      ERROR MESSAGES/STACK TRACES THAT OCCUR :
      No error message or exception

      REPRODUCIBILITY :
      This bug can be reproduced always.


      CUSTOMER SUBMITTED WORKAROUND :
      no workaround known

      Release Regression From : 5.0
      The above release value was the last known release where this
      bug was not reproducible. Since then there has been a regression.

            joehw Joe Wang
            ndcosta Nelson Dcosta (Inactive)