-
Bug
-
Resolution: Not an Issue
-
P4
-
None
-
6
-
x86
-
linux
FULL PRODUCT VERSION :
java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Client VM (build 1.6.0-b105, mixed mode, sharing)
ADDITIONAL OS VERSION INFORMATION :
Linux dhcppc0 2.6.18-3-686 #1 SMP Mon Dec 4 16:41:14 UTC 2006 i686 GNU/Linux
A DESCRIPTION OF THE PROBLEM :
Short background: I am the author of Pauker (http://pauker.sourceforge.net/), a flashcard program that stores its lesson files as XML. A user reported that after pasting text from the clipboard into a flashcard he could no longer open the lesson. I traced this to the Transformer, which produces invalid XML when it is fed certain "special" characters. As a result, the DocumentBuilder can no longer parse the files.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Compile and execute the attached test case.
Notice in the program output that the transformer emits the U+0001 character from the input string, which is not a valid XML character.
Uncomment the parsing part of the test case and notice how DocumentBuilder rightly throws a SAXParseException because of the invalid XML character produced by Transformer.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
I expect valid XML to be generated.
ACTUAL -
Invalid XML was generated.
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.*;
import org.xml.sax.InputSource;

public class XMLTest {

    public static void main(String[] args) {
        // U+0001 is a control character that can arrive via the clipboard
        String string = "Via copy & paste you can insert \"\u0001\" into a JTextArea.";
        try {
            // create document
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder documentBuilder = dbf.newDocumentBuilder();
            Document document = documentBuilder.newDocument();

            // add text element
            Element textElement = document.createElement("Text");
            Text text = document.createTextNode(string);
            textElement.appendChild(text);
            document.appendChild(textElement);

            // transform to XML
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer transformer = tf.newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            DOMSource source = new DOMSource(document);
            ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
            StreamResult result = new StreamResult(outputStream);
            transformer.transform(source, result);
            String xmlString = outputStream.toString();
            System.out.println("xmlString:\n" + xmlString);

            // try parsing
            // StringReader reader = new StringReader(xmlString);
            // InputSource inputSource = new InputSource(reader);
            // Document document2 = documentBuilder.parse(inputSource);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
---------- END SOURCE ----------
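A possible application-side workaround, sketched under assumptions: since the test case passes U+0001 straight into createTextNode, the text could be filtered against the XML 1.0 "Char" production before it reaches the DOM. The class XmlCharFilter and its methods are hypothetical names introduced for illustration only; they are not part of the JDK or of Pauker, and silently dropping characters may not be the right policy for every application.

```java
public final class XmlCharFilter {

    private XmlCharFilter() {
    }

    /** Returns true if the code point matches the XML 1.0 "Char" production. */
    static boolean isValidXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the input with every invalid XML 1.0 character removed. */
    static String stripInvalidXmlChars(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            // iterate by code point so supplementary characters stay intact
            int cp = input.codePointAt(i);
            if (isValidXml10Char(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String pasted = "Via copy & paste you can insert \"\u0001\" into a JTextArea.";
        // the U+0001 between the quotes is dropped; everything else is kept
        System.out.println(stripInvalidXmlChars(pasted));
    }
}
```

In the test case above, the filtered string would be passed to document.createTextNode(...) in place of the raw clipboard text.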