Type: Bug
Resolution: Unresolved
Priority: P4
Affects Version/s: 8u51, 9
CPU: x86
OS: windows_8
FULL PRODUCT VERSION :
java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows [Version 6.3.9600]
A DESCRIPTION OF THE PROBLEM :
Serializing a DOM document whose CDATA sections contain Windows line separators (CRLF) inserts an unwanted CR before each CRLF when run on Windows. Every CRLF is written as CR CR LF, so additional line breaks appear in the output.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Compile the program below and run it on Windows. The output shown under ACTUAL contains an extra CR before each CRLF inside the <windows> CDATA section, which produces additional line breaks.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The output on Windows should be either:
<newlines>
<unix><![CDATA[one<LF>
two<LF>
three]]></unix>
<windows><![CDATA[one<CR><LF>
two<CR><LF>
three]]></windows>
</newlines>
or
<newlines>
<unix><![CDATA[one<CR><LF>
two<CR><LF>
three]]></unix>
<windows><![CDATA[one<CR><LF>
two<CR><LF>
three]]></windows>
</newlines>
ACTUAL -
The output on Windows is:
<newlines>
<unix><![CDATA[one<CR><LF>
two<CR><LF>
three]]></unix>
<windows><![CDATA[one<CR><CR><LF>
two<CR><CR><LF>
three]]></windows>
</newlines>
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class CDataBug
{
    public static void main(String[] args) throws Exception
    {
        try (StringWriter out = new StringWriter())
        {
            output(createDocument(), out);
            dump(out.toString());
        }
    }

    private static Document createDocument() throws Exception
    {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.newDocument();
        // CData section with windows line separators
        Element windows = document.createElement("windows");
        windows.appendChild(document.createCDATASection("one\r\ntwo\r\nthree"));
        // CData section with unix line separators
        Element unix = document.createElement("unix");
        unix.appendChild(document.createCDATASection("one\ntwo\nthree"));
        Element newlines = document.createElement("newlines");
        newlines.appendChild(unix);
        newlines.appendChild(windows);
        document.appendChild(newlines);
        return document;
    }

    // Prints the text, marking CR and LF as <CR> and <LF> while inside square brackets
    // (i.e. inside a CDATA section).
    private static void dump(String text)
    {
        boolean showNewlines = false;
        for (int i = 0, size = text.length(); i < size; i++)
        {
            char c = text.charAt(i);
            switch (c)
            {
                case '[' :
                {
                    showNewlines = true;
                    System.out.print(c);
                    break;
                }
                case ']' :
                {
                    showNewlines = false;
                    System.out.print(c);
                    break;
                }
                case '\r' :
                {
                    if (showNewlines)
                    {
                        // Print only the marker, not the CR itself
                        System.out.print("<CR>");
                        continue;
                    }
                    // Intentional fall-through: outside CDATA the CR is printed as-is
                }
                case '\n' :
                {
                    if (showNewlines)
                    {
                        System.out.print("<LF>");
                    }
                    // Intentional fall-through: the line break itself is always printed
                }
                default :
                {
                    System.out.print(c);
                }
            }
        }
    }

    private static void output(Document document, StringWriter out) throws Exception
    {
        Transformer output = TransformerFactory.newInstance().newTransformer();
        output.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        output.setOutputProperty(OutputKeys.INDENT, "yes");
        output.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "1");
        output.transform(new DOMSource(document), new StreamResult(out));
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Ensure that CDATA sections contain only Unix line separators (LF).
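
The workaround could be applied at the point where the CDATA section is created. The following is a minimal sketch, not part of the original report: the class name CDataNewlines and the helpers toUnixNewlines and createUnixCData are hypothetical names chosen for illustration. Converting lone CR characters to LF as well is an additional assumption that goes beyond the workaround as stated.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.CDATASection;
import org.w3c.dom.Document;

public class CDataNewlines
{
    // Hypothetical helper: rewrites CRLF pairs to LF, then any remaining
    // lone CR to LF (the second step is an assumption, see the lead-in).
    static String toUnixNewlines(String text)
    {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    // Hypothetical helper: creates a CDATA section whose content has been
    // normalized to LF line separators before it enters the DOM.
    static CDATASection createUnixCData(Document document, String text)
    {
        return document.createCDATASection(toUnixNewlines(text));
    }

    public static void main(String[] args) throws Exception
    {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        CDATASection section = createUnixCData(document, "one\r\ntwo\r\nthree");
        // Prints "true": no CR remains in the CDATA content
        System.out.println(section.getData().indexOf('\r') < 0);
    }
}
```

In the test program above, this would mean replacing the direct call to document.createCDATASection for the <windows> element with createUnixCData. The sketch only controls what goes into the DOM; it does not change how the serializer writes line separators on Windows.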