JDK-6520131

Transformer produces invalid XML


    • Type: Bug
    • Resolution: Not an Issue
    • Priority: P4
    • None
    • 6
    • xml

      FULL PRODUCT VERSION :
      java version "1.6.0"
      Java(TM) SE Runtime Environment (build 1.6.0-b105)
      Java HotSpot(TM) Client VM (build 1.6.0-b105, mixed mode, sharing)


      ADDITIONAL OS VERSION INFORMATION :
      Linux dhcppc0 2.6.18-3-686 #1 SMP Mon Dec 4 16:41:14 UTC 2006 i686 GNU/Linux

      A DESCRIPTION OF THE PROBLEM :
      Short background: I am the author of Pauker (http://pauker.sourceforge.net/), a flashcard program that stores its lesson files as XML. A user reported that after pasting text from the clipboard into a flashcard, he could no longer open the lesson. I traced the problem to the Transformer, which produces invalid XML when it is fed certain "special" characters. As a result, the DocumentBuilder can no longer parse the files.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Compile and execute the attached test case.
      Notice in the program output that the Transformer produces "&#1", a reference to a character that is not valid in XML.
      Uncomment the parsing part of the test case and notice that the DocumentBuilder rightly throws a SAXParseException because of the invalid XML character produced by the Transformer.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      I expect valid XML to be generated.
      ACTUAL -
      Invalid XML was generated.

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      import java.io.ByteArrayOutputStream;
      import java.io.StringReader;
      import javax.xml.parsers.*;
      import javax.xml.transform.*;
      import javax.xml.transform.dom.DOMSource;
      import javax.xml.transform.stream.StreamResult;
      import org.w3c.dom.*;
      import org.xml.sax.InputSource;

      public class XMLTest {
          
          public static void main(String[] args) {
              
              // the literal contains U+0001, which is not a legal XML 1.0 character
              String string = "Via copy & paste you can insert \"\u0001\" into a JTextArea.";
              
              try {
                  // create document
                  DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
                  DocumentBuilder documentBuilder = dbf.newDocumentBuilder();
                  Document document = documentBuilder.newDocument();
                  
                  // add text element
                  Element textElement = document.createElement("Text");
                  Text text = document.createTextNode(string);
                  textElement.appendChild(text);
                  document.appendChild(textElement);
                  
                  // transform to XML
                  TransformerFactory tf = TransformerFactory.newInstance();
                  Transformer transformer = tf.newTransformer();
                  transformer.setOutputProperty(OutputKeys.INDENT, "yes");
                  DOMSource source = new DOMSource(document);
                  ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                  StreamResult result = new StreamResult(outputStream);
                  transformer.transform(source, result);
                  String xmlString = outputStream.toString();
                  System.out.println("xmlString:\n" + xmlString);
                  
                  // try parsing
                  // StringReader reader = new StringReader(xmlString);
                  // InputSource inputSource = new InputSource(reader);
                  // Document document2 = documentBuilder.parse(inputSource);
                  
              } catch (Exception e) {
                  e.printStackTrace();
              }
          }
      }
      ---------- END SOURCE ----------
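
One possible application-side workaround, sketched here as an assumption and not taken from the report or from the JDK, is to drop characters that fall outside the XML 1.0 "Char" production before the text reaches the DOM. The class name `XmlCharFilter` and both method names are hypothetical. Whether silently dropping such characters is acceptable (as opposed to replacing them or rejecting the input) is a design decision for the application.

```java
// Hypothetical helper, not part of the original test case.
class XmlCharFilter {

    // True if the code point matches the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every illegal code point dropped.
    // Unpaired surrogates are reported by codePointAt as values in
    // 0xD800-0xDFFF and are therefore dropped as well.
    static String stripIllegalXmlChars(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isLegalXml10Char(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String pasted = "Via copy & paste you can insert \"\u0001\" into a JTextArea.";
        String cleaned = stripIllegalXmlChars(pasted);
        System.out.println(cleaned.equals(
                "Via copy & paste you can insert \"\" into a JTextArea.")); // prints true
    }
}
```

In the test case above, such a filter would be applied to `string` before the call to `document.createTextNode(...)`, so that the Transformer never sees the U+0001 character.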

            spericas Santiago Pericasgeertsen
            ndcosta Nelson Dcosta (Inactive)