Bug | Resolution: Unresolved | P4 | None | 5.0 | x86 | windows_xp
FULL PRODUCT VERSION :
java version "1.5.0_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_01-b08)
Java HotSpot(TM) Client VM (build 1.5.0_01-b08, mixed mode)
J2SE 5.0 Update 8.
ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows XP [Version 5.1.2600]
A DESCRIPTION OF THE PROBLEM :
When transforming any document with the default XSLT (TrAX) transformer (com.sun.org.apache.xalan.internal.xsltc) and an 8-bit output encoding, every character with a code above 127 is written as &#xxxx;, even when the output encoding can represent it.
This happens because, when writing to the output stream, the transformer checks whether each character it is about to write is representable in the given encoding. To do so, it uses com.sun.org.apache.xml.internal.serializer.Encodings to obtain a CharToByteConverter. However, the method com.sun.org.apache.xml.internal.serializer.Encodings#findCharToByteConverterMethod() (Encodings.java, line 68) always returns null, because the value of the AccessController.doPrivileged(new PrivilegedAction() {...}) call (Encodings.java, line 71) is ignored.
To fix this bug, add "return (Method) " at the beginning of line 71 of Encodings.java.
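The following is a minimal sketch of the pattern described above, not the actual Encodings.java source. The class name DroppedReturnSketch and the helpers lookup(), findBuggy() and findFixed() are hypothetical, and the lookup of String.length() merely stands in for whatever the real privileged block resolves. It only illustrates how discarding the result of AccessController.doPrivileged makes a method always return null. (AccessController is deprecated in recent JDKs, hence the warning suppression; the generic PrivilegedAction<Method> replaces the "(Method)" cast mentioned in the report.)

```java
import java.lang.reflect.Method;
import java.security.AccessController;
import java.security.PrivilegedAction;

@SuppressWarnings("removal") // AccessController is deprecated in recent JDKs
public class DroppedReturnSketch {

    // Hypothetical stand-in for the reflective lookup done in the privileged block.
    private static Method lookup() {
        try {
            return String.class.getMethod("length");
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    // Buggy shape: the value produced by doPrivileged is discarded,
    // so control falls through to the final "return null".
    static Method findBuggy() {
        AccessController.doPrivileged(new PrivilegedAction<Method>() {
            public Method run() {
                return lookup();
            }
        });
        return null;
    }

    // Fixed shape: the value produced by doPrivileged is returned to the caller.
    static Method findFixed() {
        return AccessController.doPrivileged(new PrivilegedAction<Method>() {
            public Method run() {
                return lookup();
            }
        });
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + findBuggy()); // prints "buggy: null"
        System.out.println("fixed: " + findFixed());
    }
}
```

In the buggy variant the caller has no way to tell "no converter exists" from "the lookup result was thrown away", which matches the symptom reported here.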
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Create file "input.xml" with the following contents:
<?xml version="1.0" encoding="windows-1251"?>
<root print = "Привет" />
(If you cannot read Cyrillic, you can use any other 8-bit encoding and characters above 127 from it.)
Create file "style.xsl" with the following contents:
<?xml version="1.0" encoding="windows-1251"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:output
method="xml"
encoding="windows-1251"
/>
<xsl:template match = "root">
<root print = "{@print}"/>
</xsl:template>
</xsl:stylesheet>
Apply the stylesheet "style.xsl" to "input.xml" by any means that uses Java's default transformer. You can use Apache Ant or, for example, the code given below.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
If we fix Encodings.java in the way described above, we get
<?xml version="1.0" encoding="windows-1251"?><root print = "Привет" />
just as expected.
ACTUAL -
Because of the bug described above, the resulting file looks like:
<?xml version="1.0" encoding="windows-1251"?><root print="&#1055;&#1088;&#1080;&#1074;&#1077;&#1090;"/>
The windows-1251 characters are replaced by numeric character references, which is probably not what was intended.
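The difference between the expected and actual output can be illustrated with a simplified model of the escaping decision. This is a sketch under stated assumptions, not the Xalan serializer code: the class EscapeSketch and the method escapeUnencodable are hypothetical, java.nio's CharsetEncoder.canEncode stands in for the CharToByteConverter check, and passing null models the case where the converter lookup returns null. Surrogate pairs are not handled.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class EscapeSketch {

    // Hypothetical helper: keeps a character as-is when it is plain ASCII or when
    // the encoder can represent it; otherwise writes a decimal character reference.
    // A null encoder models the missing converter, so nothing above 127 is trusted.
    static String escapeUnencodable(String text, CharsetEncoder encoder) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128 || (encoder != null && encoder.canEncode(c))) {
                sb.append(c);
            } else {
                sb.append("&#").append((int) c).append(';');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Cyrillic test string, written with Unicode escapes so the
        // source file compiles regardless of its own encoding.
        String privet = "\u041F\u0440\u0438\u0432\u0435\u0442";
        CharsetEncoder cp1251 = Charset.forName("windows-1251").newEncoder();

        // With a working encoder the text passes through unchanged.
        System.out.println(escapeUnencodable(privet, cp1251).equals(privet));
        // Without one, every character becomes a numeric character reference.
        System.out.println(escapeUnencodable(privet, null));
    }
}
```

With the windows-1251 encoder available, all six characters are representable and nothing is escaped; with no encoder, each one is written as a reference, which is the behavior observed in this report.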
REPRODUCIBILITY :
This bug can always be reproduced.
---------- BEGIN SOURCE ----------
import org.xml.sax.*;
import javax.xml.transform.*;
import javax.xml.transform.stream.*;
import javax.xml.transform.sax.*;
import javax.xml.parsers.*;
import java.io.*;

public class Test {
    public static void main(String[] args) throws Exception {
        // Namespace-aware SAX reader, shared by the stylesheet and the input document.
        SAXParserFactory spFactory = SAXParserFactory.newInstance();
        spFactory.setNamespaceAware(true);
        XMLReader reader = spFactory.newSAXParser().getXMLReader();

        // Build a transformer from style.xsl using the default TransformerFactory.
        InputStream style = new FileInputStream("style.xsl");
        Source styleSrc = new SAXSource(reader, new InputSource(style));
        TransformerFactory tFactory = TransformerFactory.newInstance();
        Transformer transformer = tFactory.newTransformer(styleSrc);

        // Transform input.xml into output.xml; the output encoding comes from xsl:output.
        InputStream in = new FileInputStream("input.xml");
        Source inSrc = new SAXSource(reader, new InputSource(in));
        OutputStream out = new FileOutputStream("output.xml");
        Result outRes = new StreamResult(out);
        transformer.transform(inSrc, outRes);

        // Release the file handles.
        style.close();
        in.close();
        out.close();
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Fix <jdk-dir>\src\com\sun\org\apache\xml\internal\serializer\Encodings.java by adding "return (Method)" at the beginning of line 71, then recompile it.
Unpack <jdk-dir>\jre\lib\rt.jar, replace Encodings.class with the newly compiled version, repack the archive, and replace the original rt.jar with it.