Type: Bug
Resolution: Cannot Reproduce
Priority: P3
Fix Version/s: None
Affects Version/s: 6
Component/s: xml
Labels:
- 6-wnf
- 7-wnf
- 8-wnf
- JAXP
- regression
- webbug

Subcomponent: org.w3c.dom
CPU: x86
OS: windows_xp

FULL PRODUCT VERSION :
java version "1.6.0_10"
Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
Java HotSpot(TM) Client VM (build 11.0-b15, mixed mode, sharing)

ADDITIONAL OS VERSION INFORMATION :
Windows XP

A DESCRIPTION OF THE PROBLEM :
If a document contains a Traditional Chinese character (one encoded as 4 bytes in UTF-8) immediately after a numeric character reference, the resulting DOM contains garbage characters.
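For context, the code point used in the test case (65766, i.e. U+100E6) lies outside the Basic Multilingual Plane, so it takes four bytes in UTF-8 and a surrogate pair in Java's UTF-16 strings. The following is a minimal sketch that checks this using only standard JDK calls; the class name SupplementaryCharInfo and its helper methods are made up for illustration and are not part of the original report.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
public class SupplementaryCharInfo {

    // Same code point as CHINESE_STR in the reported test case.
    static final int CODE_POINT = 65766; // U+100E6

    // Builds the one-character string the same way the test case does.
    static String asString() {
        return new String(Character.toChars(CODE_POINT));
    }

    // Number of bytes the character occupies when encoded as UTF-8.
    static int utf8Length() {
        return asString().getBytes(Charset.forName("UTF-8")).length;
    }

    // Number of UTF-16 code units (Java chars) the character occupies.
    static int utf16Length() {
        return asString().length();
    }

    public static void main(String[] args) {
        System.out.println("code point : U+" + Integer.toHexString(CODE_POINT).toUpperCase());
        System.out.println("supplementary: " + Character.isSupplementaryCodePoint(CODE_POINT));
        System.out.println("UTF-8 bytes  : " + utf8Length());
        System.out.println("UTF-16 units : " + utf16Length());
    }
}
```

Any parser code path that treats one Java char as one character has to handle such a surrogate pair explicitly, which is presumably where the reported problem is triggered.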

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the test case.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
All tests should be successful.

ACTUAL -
testCharRefAndRawChineseChar() fails.
The digits of the numeric character reference itself ("80" in the test case) are inserted before the unescaped Chinese character.
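Since consoles often cannot display supplementary characters, stray characters in the parsed value are easier to spot when the string is dumped as a list of code points. The following is a minimal sketch under that assumption; CodePointDump and describe are hypothetical names, and the string in main is the expected value from the test case, not captured parser output.

```java
// Hypothetical diagnostic helper, for illustration only.
public class CodePointDump {

    // Returns the code points of s as space-separated "U+XXXX" tokens.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            // Advance by 2 chars for a surrogate pair, 1 otherwise.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Expected DOM value in the test case: 'P' followed by U+100E6.
        String expected = "P" + new String(Character.toChars(65766));
        System.out.println(describe(expected)); // prints "U+0050 U+100E6"
    }
}
```

Calling describe(readValue) inside checkXMLParsing would show any extra code points (for example U+0038 U+0030 for the digits "80") that the parser inserted.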

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.io.ByteArrayInputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import junit.framework.TestCase;

import org.w3c.dom.Document;

public class XMLChineseTest extends TestCase {


    static final String CHINESE_STR = new String(Character.toChars(65766));


    public void testRawChineseChar() throws Exception {

        checkXMLParsing(CHINESE_STR, CHINESE_STR);
    }


    public void testCharRefAndEscapedChineseChar() throws Exception {

        // Both characters are written as numeric character references.
        checkXMLParsing("&#80;&#65766;", (char)(80) + CHINESE_STR);
    }


    public void testCharRefAndRawChineseChar() throws Exception {

        // A numeric character reference followed by the raw (unescaped) character.
        checkXMLParsing("&#80;" + CHINESE_STR, (char)(80) + CHINESE_STR);
    }


    private void checkXMLParsing(String encodedValue, String expectedDOMValue) throws Exception {

        String xml = "<truc value=\"" + encodedValue + "\" />";
        System.out.println("xml input: " + xml);
        byte[] xmlBytes = xml.getBytes("UTF-8");

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));

        String readValue = doc.getDocumentElement().getAttribute("value");
        System.out.println("Read value: " + readValue);
        assertEquals(expectedDOMValue, readValue);
    }
}
---------- END SOURCE ----------

Release Regression From : 5.0u12
The above release value was the last known release where this
bug was not reproducible. Since then there has been a regression.

Assignee: Joe Wang
Reporter: Nelson Dcosta (Inactive)
Votes: 0
Watchers: 2
Created: 2008-12-08 01:35
Updated: 2017-04-18 17:12
Resolved: 2017-04-18 17:12
Imported: 17/Sep/12 5:22 PM
Indexed: 20/Jul/12 8:26 AM
