Type: Bug
Resolution: Not an Issue
Priority: P2
Fix Version/s: None
Affects Version/s: 6
Component/s: core-libs
Labels:
- webbug

Subcomponent: java.nio.charsets
CPU: x86
OS: windows_xp

FULL PRODUCT VERSION :
java version "1.6.0-beta2"
Java(TM) SE Runtime Environment (build 1.6.0-beta2-b86)
Java HotSpot(TM) Client VM (build 1.6.0-beta2-b86, mixed mode, sharing)

ADDITIONAL OS VERSION INFORMATION :
Windows XP Professional SP 2

A DESCRIPTION OF THE PROBLEM :
This bug is responsible for the following behavior:
Some UTF-16 characters cannot be put into a JDOM document after they have been encoded with a CharsetEncoder. The returned ByteBuffer contains a null byte at the end, and this zero byte appears to cause the error while building the DOM.

The behaviour also differs between version 1.5.0_07 and version 1.6.0 (b86): a different character triggers it in each version:

"\u0237" - version 1.5.0_07 OK, version 1.6.0 NOK
"\u304E" - version 1.5.0_07 NOK, version 1.6.0 OK

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the class CharsetEncoderTest twice, one time with java 1.5.0_07 and the second time with Java 1.6.0 b86...

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
CharsetEncoder should encode the two Unicode (UTF-16) characters into UTF-8 Characters, which then could be used as the Text of an XML DOM entry.
ACTUAL -
XML-DOM should accept the encoded String generated from the ByteBuffer returned by the CharsetEncoder.

The ByteBuffer contained an additional "empty" byte with the value 0.

(This behavior occurs in both Java versions mentioned, but with different characters...)

ERROR MESSAGES/STACK TRACES THAT OCCUR :
Exception in thread "main" org.jdom.IllegalDataException: The data "AA " is not legal for a JDOM attribute: 0x0 is not a legal XML character.
at org.jdom.Attribute.setValue(Attribute.java:486)
at org.jdom.Attribute.<init>(Attribute.java:229)
at org.jdom.Attribute.<init>(Attribute.java:252)
at org.jdom.Element.setAttribute(Element.java:1109)
at test.CharsetEncoderTest.testEncodeSaveXML(CharsetEncoderTest.java:39)
at test.CharsetEncoderTest.main(CharsetEncoderTest.java:20)

!!! NOTE !!!: The space in the String "AA " was not a space in the original Error Message. It was an undisplayable Character.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

import org.jdom.Document;
import org.jdom.Element;

public class CharsetEncoderTest {

    private static int encodee160 = 0x304E; // Works only with version 1.6.0
    private static int encodee150_07 = 0x237; // Works only with version 1.5.0_07
    private static String encoded;

    public static void main(String[] args) {
        testEncodeSaveXML(encodee150_07);
        testEncodeSaveXML(encodee160);
    }

    public static void testEncodeSaveXML(int character) {
        Charset set = Charset.forName("UTF-8");
        CharsetEncoder encoder = set.newEncoder();
        CharBuffer chb = CharBuffer.allocate(1);
        chb.put((char) character);
        chb.rewind();
        encoder.reset();
        try {
            ByteBuffer bb;
            bb = encoder.encode(chb);
            byte[] ba = bb.array();
            encoded = new String(ba, "ISO-8859-1");
            Document doc = new Document();
            Element e = new Element("XMLChar");
            e.setAttribute("value", encoded);
            doc.setRootElement(e);
        } catch (CharacterCodingException e) {
            e.printStackTrace();
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Removing the last (wrong) character from the encoded String before processing if encoding resulted in a null byte...
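
A possible alternative to trimming the String, offered as a sketch and not taken from the resolution notes: ByteBuffer.array() returns the whole backing array regardless of the buffer's position and limit, so the array may be longer than the encoded data. Copying only the bytes between position and limit avoids picking up any unused trailing bytes. The class name EncodedBytesSketch and the helper encodeUtf8 are hypothetical names used for illustration, and JDOM is left out so the example is self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class EncodedBytesSketch {

    // Hypothetical helper: returns only the bytes the encoder actually wrote,
    // i.e. the region between the buffer's position and its limit.
    static byte[] encodeUtf8(char c) {
        CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
        try {
            ByteBuffer bb = encoder.encode(CharBuffer.wrap(new char[] { c }));
            byte[] exact = new byte[bb.remaining()];
            bb.get(exact); // copies remaining() bytes, ignoring spare capacity
            return exact;
        } catch (CharacterCodingException e) {
            // Not expected for the two characters from this report.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        char[] samples = { '\u0237', '\u304E' };
        for (char c : samples) {
            byte[] exact = encodeUtf8(c);
            System.out.println("U+" + Integer.toHexString(c).toUpperCase()
                    + " -> " + exact.length + " byte(s)");
        }
    }
}
```

In UTF-8, U+0237 occupies two bytes and U+304E occupies three, so the copied array should have exactly that length on either Java version, with no zero byte left over to reach Element.setAttribute.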

Assignee: Martin Buchholz
Reporter: Girish Manwani (Inactive)
Votes: 0
Watchers: 0
Created: 2006-06-23 14:41
Updated: 2010-04-02 15:29
Resolved: 2006-06-23 17:18
Imported: 15/Sep/12 1:24 PM
Indexed: 17/Jul/12 10:55 AM
