-
Bug
-
Resolution: Duplicate
-
P3
-
None
-
9, 10, 11
-
x86_64
-
generic
A DESCRIPTION OF THE PROBLEM :
It looks like there is a bug where a new line is inserted inside CDATA sections when an XSLT transformation is used.
It is not clear whether this is the correct place to raise the bug: I think the issue is in the com.sun.org.apache packages, but I can't see where they are maintained or whether they are modified before being included in the JDK.
When writing out the value 'ABCDEFGHIJKLMNOPQRST985', a new line is added within the value. This can be seen in:
<p>
<![CDATA[ABCDEFGHIJKLMNOPQRST
985]]>
</p>
yet it should look like:
<p>
<![CDATA[ABCDEFGHIJKLMNOPQRST985]]>
</p>
The problem is easy to reproduce, see below.
I suspect the problem has something to do with:
com.sun.org.apache.xml.internal.util.FastStringBuffer#sendSAXcharacters
The issue happens when the data for a single value is spread over multiple arrays in m_array.
Within com.sun.org.apache.xml.internal.serializer.ToStream#cdata, the new line and indent are added in the middle of writing the value of a CDATA section. This happens at these lines within the method:
if (shouldIndent()) <-- this returns true in the middle of cdata
indent();
This looks fixed in JDK 12; however, that is not released yet.
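To make the suspected mechanism concrete, here is a minimal toy model. It is not the JDK serializer code: the class name ChunkedCdataSketch, the method writeCdata and the flag indentPerChunk are all hypothetical, and the sketch only assumes what is described above, namely that one text value reaches the serializer as several chunks and that an indent check fires again for a later chunk.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model only: NOT the JDK serializer, just the failure mode described in this report. */
public class ChunkedCdataSketch {

    /**
     * Hypothetical stand-in for a serializer that receives a single text value
     * as several chunks (for example because it spans multiple internal arrays).
     */
    public static String writeCdata(List<String> chunks, boolean indentPerChunk) {
        StringBuilder out = new StringBuilder("<![CDATA[");
        boolean first = true;
        for (String chunk : chunks) {
            // Assumed buggy behaviour: every chunk after the first is treated as if
            // it started new content, so a newline + indent is written even though
            // the CDATA section is still open.
            if (indentPerChunk && !first) {
                out.append("\n    ");
            }
            out.append(chunk);
            first = false;
        }
        return out.append("]]>").toString();
    }

    public static void main(String[] args) {
        List<String> chunks = Arrays.asList("ABCDEFGHIJKLMNOPQRST", "985");
        System.out.println(writeCdata(chunks, true));   // value is split by newline + indent
        System.out.println(writeCdata(chunks, false));  // value stays contiguous
    }
}
```

With indentPerChunk set to false the value is written in one piece, which is the behaviour expected from the real serializer regardless of how the characters are chunked internally.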
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Create an XML document with many values; I think at least 1KB of data is needed. Then write an XSLT stylesheet that writes out all of those values within CDATA sections.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Values within CDATA sections should not have occasional new lines and indents added
ACTUAL -
Values within CDATA sections have occasional new lines and indents added to the values.
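As a quick way to spot the actual behaviour in a large output, the following sketch scans a serialized document for CDATA sections whose content contains a line break. The class CdataLineBreakCheck and the method brokenSections are hypothetical helpers, not part of any JDK API, and the check is only meaningful under the assumption that the input values themselves contain no line breaks (as is the case for the values used in this report).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: lists CDATA sections that contain a line break. */
public class CdataLineBreakCheck {

    // DOTALL lets '.' match line breaks, so sections that were split are still captured whole.
    private static final Pattern CDATA =
            Pattern.compile("<!\\[CDATA\\[(.*?)\\]\\]>", Pattern.DOTALL);

    /** Returns the content of every CDATA section that contains '\n' or '\r'. */
    public static List<String> brokenSections(String xml) {
        List<String> broken = new ArrayList<>();
        Matcher m = CDATA.matcher(xml);
        while (m.find()) {
            String body = m.group(1);
            if (body.indexOf('\n') >= 0 || body.indexOf('\r') >= 0) {
                broken.add(body);
            }
        }
        return broken;
    }

    public static void main(String[] args) {
        String xml = "<p>\n<![CDATA[ABCDEFGHIJKLMNOPQRST\n985]]>\n</p>\n"
                   + "<p>\n<![CDATA[ABCDEFGHIJKLMNOPQRST986]]>\n</p>\n";
        System.out.println(brokenSections(xml).size() + " CDATA section(s) contain a line break");
    }
}
```

Running the same check against the output of the reproducer would report every affected value in one pass, instead of asserting on individual values.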
---------- BEGIN SOURCE ----------
import static java.nio.charset.StandardCharsets.UTF_8;
import org.junit.Assert;
import org.junit.Test;
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
/**
* Demonstrates an issue with new line + indent being added to CDATA sections
*
* The issue can be seen in the 2nd cdata section here:
*
* <?xml version="1.0" encoding="UTF-8" standalone="no"?>
* <root>
* <inner>
* <p>
* <![CDATA[ABCDEFGHIJKLMNOPQRST984]]>
* </p>
* <p>
* <![CDATA[ABCDEFGHIJKLMNOPQRST
* 985]]>
* </p>
* <p>
* <![CDATA[ABCDEFGHIJKLMNOPQRST986]]>
* </p>
* </inner>
* </root>
*
*
*/
public class XMLIndentTest {
private static final String VALUE_PREFIX = "ABCDEFGHIJKLMNOPQRST";
@Test
public void test() throws Exception {
String xml = makeXml();
Templates templates
= TransformerFactory.newInstance().newTemplates(new StreamSource(new StringReader(XSLT)));
ByteArrayOutputStream bos = new ByteArrayOutputStream();
templates.newTransformer().transform(new StreamSource(new StringReader(xml)),
new StreamResult(bos));
String res = new String(bos.toByteArray(), UTF_8);
System.out.println(res);
Assert.assertTrue(res.contains("<![CDATA[ABCDEFGHIJKLMNOPQRST984]]>"));
Assert.assertTrue(res.contains("<![CDATA[ABCDEFGHIJKLMNOPQRST986]]>"));
Assert.assertTrue(res.contains("<![CDATA[ABCDEFGHIJKLMNOPQRST985]]>"));
}
private String makeXml() {
StringBuffer sb = new StringBuffer();
sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n");
sb.append("<root>\n");
sb.append("<inner>\n");
for(int i = 0; i < 984; i++) {
sb.append("<cd1><v>")
.append(VALUE_PREFIX)
.append(i)
.append("</v></cd1>\n");
}
for(int i = 984; i < 987; i++) {
sb.append("<cd><v>")
.append(VALUE_PREFIX)
.append(i)
.append("</v></cd>\n");
}
sb.append("</inner>\n");
sb.append("</root>\n");
return sb.toString();
}
private static final String XSLT = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
"<xsl:stylesheet version=\"2.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">\n" +
"<xsl:output\n" +
" media-type=\"text/xml\"\n" +
" encoding=\"UTF-8\"\n" +
" method=\"xml\"\n" +
" indent=\"yes\"\n" +
" cdata-section-elements=\"p\"\n" +
" standalone=\"no\" />\n" +
"\n" +
"<xsl:template match=\"@*|node()\" />\n" +
"\n" +
"<xsl:template match=\"/\">\n" +
" <root>\n" +
" <inner>\n" +
" <xsl:for-each select=\"root/inner/cd\">\n" +
" <p>" +
" <xsl:value-of select=\"v\" />\n" +
" </p>" +
" </xsl:for-each>\n" +
" </inner>\n" +
" </root>\n" +
"</xsl:template>\n" +
"</xsl:stylesheet>";
}
---------- END SOURCE ----------
FREQUENCY : always
- relates to
-
JDK-8207760 SAXException: Invalid UTF-16 surrogate detected: d83c ?
- Resolved