-
Bug
-
Resolution: Cannot Reproduce
-
P4
-
None
-
1.4.2
-
generic
-
generic
All versions of 1.4.2 on all platforms incorrectly return a multibyte
value (0x21 0x29) instead of the single byte 0x3f for '?' (the replacement
character) when a Unicode character cannot be mapped to a character in the
EUC_JP character set.
The following program demonstrates the problem.
import java.nio.charset.*;

class CharTest
{
    public static void main(String[] args)
    {
        try {
            // U+2015 is the character the report uses as unmappable in EUC_JP
            String unicode = "\u2015";
            byte[] bytes = unicode.getBytes("EUC_JP");
            for (int i = 0; i < bytes.length; i++) {
                System.out.println("0x" + Integer.toHexString(0xff & (int) bytes[i]));
            }
            Charset charset = Charset.forName("EUC_JP");
            System.out.println("charset EUC_JP == " + charset.displayName());
            CharsetEncoder encoder = charset.newEncoder();
            System.out.println(encoder.toString());
            // Print the replacement bytes the encoder substitutes for unmappable input
            bytes = encoder.replacement();
            System.out.println("replacement Value");
            for (int i = 0; i < bytes.length; i++) {
                System.out.println("0x" + Integer.toHexString(0xff & (int) bytes[i]));
            }
        } catch (Exception cce) {
            cce.printStackTrace();
        }
    }
}
1.4.2 output
------------
0x21
0x29
charset EUC_JP == EUC-JP
sun.nio.cs.ext.EUC_JP$Encoder@1f12c4e
replacement Value
0x21
0x29
Other versions
--------------
0x3f
charset EUC_JP == EUC-JP
sun.nio.cs.ext.EUC_JP$Encoder@a8c4e7
replacement Value
0x3f
The problem also existed in one of the earlier 5.0 JavaSoft drops, but
it was fixed prior to the 5.0 FCS drop.
Here is the comment in the 5.0 file that contains the fix.
share/classes/sun/nio/cs/ext/EUC_JP.java:
    public CharsetEncoder newEncoder() {
        // Need to force the replacement byte to 0x3f
        // because JIS_X_0208_Encoder defines its own
        // alternative 2 byte substitution to permit it
        // to exist as a self-standing Encoder
        byte[] replacementBytes = { (byte)0x3f };
        return new Encoder(this).replaceWith(replacementBytes);
    }
###@###.### 2005-04-19 23:17:28 GMT
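The 5.0 fix above forces the replacement inside the charset itself. A caller using the java.nio.charset API can request the same single-byte substitution explicitly. The following is only a minimal sketch: the class name ForceQuestionMark and the helper encode are hypothetical and do not come from the JDK sources, and it has not been verified against an affected 1.4.2 build.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, for illustration only.
class ForceQuestionMark {

    // Encodes text with the named charset, substituting the single byte
    // 0x3f ('?') for malformed or unmappable input instead of relying on
    // the encoder's default replacement bytes.
    static byte[] encode(String text, String charsetName)
            throws CharacterCodingException {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(new byte[] { (byte) 0x3f });
        ByteBuffer out = encoder.encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    public static void main(String[] args) throws Exception {
        // Pass "EUC_JP" as the first argument to mirror the report;
        // US-ASCII is the default because every JRE must provide it.
        String charsetName = args.length > 0 ? args[0] : "US-ASCII";
        byte[] bytes = encode("\u2015", charsetName);
        for (int i = 0; i < bytes.length; i++) {
            System.out.println("0x" + Integer.toHexString(0xff & bytes[i]));
        }
    }
}
```

With a charset that cannot represent U+2015, such as US-ASCII, the helper yields the single byte 0x3f. Whether U+2015 is unmappable in EUC_JP depends on the mapping tables of the JDK in use, so the EUC_JP result should be checked on the target release.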