- Type: Bug
- Resolution: Fixed
- Priority: P3
- Affects Version: 7u60, 8u72
- Resolved In Build: b119
- CPU: x86_64
- OS: windows_7
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8163657 | 8u121 | Aleksej Efimov | P3 | Resolved | Fixed | b01 |
JDK-8156955 | 8u112 | Aleksej Efimov | P3 | Resolved | Fixed | b01 |
JDK-8157095 | 8u111 | Aleksej Efimov | P3 | Resolved | Fixed | b01 |
JDK-8157056 | 8u102 | Aleksej Efimov | P3 | Resolved | Fixed | b08 |
JDK-8167773 | emb-8u121 | Aleksej Efimov | P3 | Resolved | Fixed | b01 |
JDK-8162136 | emb-8u111 | Aleksej Efimov | P3 | Resolved | Fixed | b01 |
JDK-8157003 | 7u121 | Aleksej Efimov | P3 | Resolved | Fixed | b01 |
JDK-8157059 | 7u111 | Aleksej Efimov | P3 | Resolved | Fixed | b08 |
1.7.0_65
ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows [Version 6.1.7601]
A DESCRIPTION OF THE PROBLEM :
I have narrowed down a problem where our application produced XML that it could not parse back. The XML contained character references with invalid values (XML permits only certain ranges for them). It turned out that these character references are generated specifically for characters outside the BMP, i.e. characters that are encoded as a surrogate pair in a Java string. Further investigation revealed that this happens only when the XMLStreamWriter is constructed with an OutputStreamWriter. When a plain OutputStream is used, the surrogates are encoded as valid UTF-8 multibyte sequences. The error cannot, however, be in the OutputStreamWriter, since character references are specific to XML, of which the OutputStreamWriter knows nothing.
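To illustrate the difference between a valid and an invalid character reference for a supplementary character, here is a minimal sketch. The class `CharRefSketch` and its methods are hypothetical helpers written for this illustration; they are not the JDK's internal implementation, and they ignore markup characters such as `<` and `&`. Escaping each UTF-16 `char` separately yields two references to surrogate code points, which XML does not allow, whereas combining the pair into one code point first yields a single valid reference.

```java
// Hypothetical illustration only; not the code used inside XMLStreamWriter.
final class CharRefSketch {

    /** Formats one value as a hexadecimal XML character reference. */
    static String charRef(int value) {
        return "&#x" + Integer.toHexString(value).toUpperCase() + ";";
    }

    /** Escapes non-ASCII text per code point, so a surrogate pair becomes ONE reference. */
    static String escapePerCodePoint(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);   // combines a high/low surrogate pair
            i += Character.charCount(cp);   // advances by 1 or 2 chars
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                out.append(charRef(cp));
            }
        }
        return out.toString();
    }

    /** Escapes non-ASCII text per UTF-16 char; surrogates become TWO invalid references. */
    static String escapePerChar(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(charRef(c));
            }
        }
        return out.toString();
    }
}
```

For the smiley U+1F60A, `escapePerCodePoint` produces `&#x1F60A;`, while `escapePerChar` produces `&#xD83D;&#xDE0A;`, a pair of references that a conforming XML parser rejects.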
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
I am attaching a test program which clearly demonstrates the problem.
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
package com.dramaqueen.exporters;

import static org.junit.Assert.*;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

import org.junit.Test;

@SuppressWarnings("nls")
public class StreamVersusWriterTest {

    @Test
    public void streamVersusWriter() {
        String charset = "UTF-8";
        // Public java.io class instead of the internal SAAJ ByteOutputStream
        ByteArrayOutputStream streamA = new ByteArrayOutputStream();
        ByteArrayOutputStream streamB = new ByteArrayOutputStream();
        XMLOutputFactory factory = XMLOutputFactory.newInstance();
        try {
            // Writer A: created directly on an OutputStream
            XMLStreamWriter writerA = factory.createXMLStreamWriter(streamA,
                    charset);
            generateXML(writerA, charset);

            // Writer B: created on an OutputStreamWriter (shows the problem)
            OutputStreamWriter streamWriter = new OutputStreamWriter(streamB,
                    charset);
            XMLStreamWriter writerB = factory.createXMLStreamWriter(
                    streamWriter);
            generateXML(writerB, charset);

            // Decode with the charset used for writing, not the platform default
            String outputA = streamA.toString(charset);
            String outputB = streamB.toString(charset);
            System.out.println("output using OutputStream : " + outputA);
            System.out.println("output using OutputStreamWriter: " + outputB);
            // assertEquals(outputA, outputB);
            readXML(outputA.getBytes(charset), charset);
            readXML(outputB.getBytes(charset), charset);
        } catch (XMLStreamException e) {
            e.printStackTrace();
            // assertTrue(false);
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
            // assertTrue(false);
        }
    }

    private void generateXML(XMLStreamWriter writer, String charset)
            throws XMLStreamException {
        // Char sequence containing a smiley (U+1F60A) which is encoded as a
        // surrogate pair in the Java string
        String sequence = "A\uD83D\uDE0AB\u00DF";
        writer.writeStartDocument(charset, "1.0");
        writer.writeStartElement("a");
        writer.writeCharacters(sequence);
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.flush();
    }

    private void readXML(byte[] xmlData, String charset)
            throws XMLStreamException {
        InputStream stream = new ByteArrayInputStream(xmlData);
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader xmlReader
                = factory.createXMLStreamReader(stream, charset);
        // Reading to the end fails if the document contains invalid references
        while (xmlReader.hasNext()) {
            xmlReader.next();
        }
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Use OutputStream, not OutputStreamWriter
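The workaround can be sketched roughly as follows, assuming UTF-8 output. The class `OutputStreamWorkaround` and its method names are hypothetical and exist only for this illustration. The writer is created directly on an `OutputStream` with an explicit encoding, and the result is parsed back to confirm that the supplementary character survives the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class illustrating the submitted workaround.
final class OutputStreamWorkaround {

    /** Writes <a>text</a> through an XMLStreamWriter created on an OutputStream. */
    static byte[] write(String text) throws XMLStreamException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Pass the OutputStream and the encoding, not an OutputStreamWriter
        XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
        writer.writeStartDocument("UTF-8", "1.0");
        writer.writeStartElement("a");
        writer.writeCharacters(text);
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.flush();
        writer.close();
        return out.toByteArray();
    }

    /** Parses the document and returns the concatenated character data. */
    static String read(byte[] xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new ByteArrayInputStream(xml), "UTF-8");
        StringBuilder text = new StringBuilder();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.CHARACTERS) {
                text.append(reader.getText());
            }
        }
        reader.close();
        return text.toString();
    }

    /** Convenience wrapper that converts the checked exception to an unchecked one. */
    static String roundTrip(String text) {
        try {
            return read(write(text));
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML round trip failed", e);
        }
    }
}
```

Calling `OutputStreamWorkaround.roundTrip("A\uD83D\uDE0AB\u00DF")` should return the input unchanged if the writer emits well-formed XML for the surrogate pair.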
- backported by
  - JDK-8156955 XMLStreamWriter produces invalid XML for surrogate pairs on OutputStreamWriter (Resolved)
  - JDK-8157003 XMLStreamWriter produces invalid XML for surrogate pairs on OutputStreamWriter (Resolved)
  - JDK-8157056 XMLStreamWriter produces invalid XML for surrogate pairs on OutputStreamWriter (Resolved)
  - JDK-8157059 XMLStreamWriter produces invalid XML for surrogate pairs on OutputStreamWriter (Resolved)
  - JDK-8157095 XMLStreamWriter produces invalid XML for surrogate pairs on OutputStreamWriter (Resolved)
  - JDK-8162136 XMLStreamWriter produces invalid XML for surrogate pairs on OutputStreamWriter (Resolved)
  - JDK-8163657 XMLStreamWriter produces invalid XML for surrogate pairs on OutputStreamWriter (Resolved)
  - JDK-8167773 XMLStreamWriter produces invalid XML for surrogate pairs on OutputStreamWriter (Resolved)
- duplicates
  - JDK-8073700 XMLStreamWriter outputs Unicode extended characters (non-BMP) incorrectly (Closed)
- relates to
  - JDK-8276207 Properties.loadFromXML/storeToXML works incorrectly for supplementary characters (Closed)