Name: nl37777 Date: 04/18/2004
The specification for the Java Debug Wire Protocol states
that strings are encoded in UTF-8. Since UTF-8 is a widely known and
implemented standard, and there's no indication that anything else
could be meant, this should be understood to mean standard UTF-8. The
front end implements this correctly in the
com.sun.tools.jdi.PacketStream class by using the character conversion
facilities of the String class and specifying "UTF-8" as the character
encoding.
Unfortunately, the back end implementation passes string byte sequences
back and forth between JDWP and JNI interfaces as if both interfaces
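used the same representation. This is not the case: JNI uses a modified
form of UTF-8, which differs from standard UTF-8 in two ways: the null
character (U+0000) is encoded as the two bytes 0xC0 0x80 rather than a
single zero byte, and each supplementary character is encoded as a
surrogate pair of two three-byte sequences rather than as one four-byte
sequence.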
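The mismatch can be observed from plain Java, without touching JNI. The sketch below is illustrative only: the class and method names are made up for this report, and it relies on DataOutputStream.writeUTF, which is documented to emit the same modified UTF-8 form (after a two-byte length prefix that is stripped here).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of the JDWP front end or back end.
public class Utf8Difference {

    /** Standard UTF-8 bytes, the form the front end sends over JDWP. */
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Modified UTF-8 bytes, the form JNI string functions use. */
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s); // writes a 2-byte length, then modified UTF-8
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length); // drop the length prefix
    }

    /** Formats bytes as space-separated upper-case hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String nul = "\u0000";                // the null character
        String supplementary = "\uD801\uDC00"; // U+10400, outside the BMP

        System.out.println(hex(standardUtf8(nul)));           // 00
        System.out.println(hex(modifiedUtf8(nul)));           // C0 80
        System.out.println(hex(standardUtf8(supplementary))); // F0 90 90 80
        System.out.println(hex(modifiedUtf8(supplementary))); // ED A0 81 ED B0 80
    }
}
```

For strings containing neither kind of character the two byte sequences are identical, which is presumably why the problem goes unnoticed in most debugging sessions.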
The implementation should be corrected to convert between standard and
modified UTF-8 where necessary, i.e., at least when supplementary
characters or the null character are present in a string.
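As a rough illustration of the conversion being asked for, the following sketch rewrites byte sequences in both directions. It is not the proposed fix itself: the back end is native code, the class and method names are invented for this example, and the code assumes well-formed input and does no validation.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper; shows only the byte-level transformation.
public class Utf8Converter {

    /** Standard UTF-8 (JDWP side) to modified UTF-8 (JNI side). */
    static byte[] standardToModified(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length) {
            int b = in[i] & 0xFF;
            if (b == 0x00) {
                // Null character becomes the two-byte form C0 80.
                out.write(0xC0);
                out.write(0x80);
                i += 1;
            } else if ((b & 0xF8) == 0xF0 && i + 3 < in.length) {
                // Four-byte sequence: decode, split into surrogates.
                int cp = ((b & 0x07) << 18)
                        | ((in[i + 1] & 0x3F) << 12)
                        | ((in[i + 2] & 0x3F) << 6)
                        | (in[i + 3] & 0x3F);
                int offset = cp - 0x10000;
                writeThreeBytes(out, 0xD800 + (offset >> 10));
                writeThreeBytes(out, 0xDC00 + (offset & 0x3FF));
                i += 4;
            } else {
                out.write(b); // one- to three-byte sequences are identical
                i += 1;
            }
        }
        return out.toByteArray();
    }

    /** Modified UTF-8 (JNI side) to standard UTF-8 (JDWP side). */
    static byte[] modifiedToStandard(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length) {
            int b = in[i] & 0xFF;
            if (b == 0xC0 && i + 1 < in.length && (in[i + 1] & 0xFF) == 0x80) {
                out.write(0x00);
                i += 2;
            } else if (b == 0xED && i + 5 < in.length
                    && (in[i + 1] & 0xF0) == 0xA0      // high surrogate
                    && (in[i + 3] & 0xFF) == 0xED
                    && (in[i + 4] & 0xF0) == 0xB0) {   // low surrogate
                int high = 0xD000 | ((in[i + 1] & 0x3F) << 6) | (in[i + 2] & 0x3F);
                int low = 0xD000 | ((in[i + 4] & 0x3F) << 6) | (in[i + 5] & 0x3F);
                int cp = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
                i += 6;
            } else {
                out.write(b);
                i += 1;
            }
        }
        return out.toByteArray();
    }

    /** Writes a UTF-16 code unit in the range 0x0800..0xFFFF as three bytes. */
    private static void writeThreeBytes(ByteArrayOutputStream out, int unit) {
        out.write(0xE0 | (unit >> 12));
        out.write(0x80 | ((unit >> 6) & 0x3F));
        out.write(0x80 | (unit & 0x3F));
    }
}
```

Since the two encodings agree on everything else, a real implementation could scan for a zero byte or a four-byte lead byte (or, in the other direction, for C0 80 or an encoded surrogate) and skip the copy entirely when none is found.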
======================================================================
###@###.### 10/7/04 23:57 GMT