Some Japanese characters are converted incorrectly when decoded with the “JISAutoDetect” encoding on Java 1.4.2_26. The problem does not occur with JDK 5.0_22 or 6.0_19.
Steps to run the test case
1. Set LANG to 'ja_JP.PCK'
2. Compile with javac and run:
java moji2 | od -x
Here is the test result.
JDK 1.4.2_26
/opt/java14224/bin/java moji2 | od -x
0000000 815b 3f8a 8740 3f54 3f41 3f55 0a00
             ==
So the first byte (0x87) of 0x878a is converted to 0x3f ('?'). The same substitution appears where 0x8754, 0x8741 and 0x8755 are expected.
JDK 6u19
/opt/java16006/bin/java moji2 | od -x
0000000 815b 878a 8740 8754 8741 8755 0a00
             ====
The value 0x878a is the correct result.
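The two dumps above can also be compared byte by byte. The following is only a sketch: the class name OdDiff and its methods are hypothetical and not part of the attached test case, and it assumes that od -x prints each 16-bit word high byte first, as the trailing 0a00 in both dumps suggests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original test case.
class OdDiff {
    // Convert the words of one "od -x" line (offset column removed) into
    // byte values, assuming each word is printed high byte first.
    static int[] toBytes(String words) {
        String[] parts = words.trim().split("\\s+");
        int[] out = new int[parts.length * 2];
        for (int i = 0; i < parts.length; i++) {
            int w = Integer.parseInt(parts[i], 16);
            out[2 * i] = (w >> 8) & 0xff;
            out[2 * i + 1] = w & 0xff;
        }
        return out;
    }

    // Describe every byte offset at which the two dumps disagree.
    static List<String> diff(String actual, String expected) {
        int[] a = toBytes(actual);
        int[] e = toBytes(expected);
        List<String> result = new ArrayList<String>();
        int n = Math.min(a.length, e.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != e[i]) {
                result.add(String.format(
                        "offset %d: expected 0x%02x, got 0x%02x", i, e[i], a[i]));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Dump lines copied from the results above.
        String jdk142 = "815b 3f8a 8740 3f54 3f41 3f55 0a00";
        String jdk6 = "815b 878a 8740 8754 8741 8755 0a00";
        for (String line : diff(jdk142, jdk6)) {
            System.out.println(line);
        }
    }
}
```

With the two dumps quoted in this report, every reported difference is a 0x87 byte that was replaced by 0x3f.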
The test case is attached and reproduced below:
import java.io.*;

public class moji2 {
    public static void main(String[] args) {
        try {
            // In MS932 these characters are the bytes
            // 81 5b 87 8a 87 40 87 54 87 41 87 55 (see the od output).
            String str = "ー㈱①Ⅰ②Ⅱ";
            byte[] bytes = str.getBytes("MS932");
            // Decoding with "JISAutoDetect" shows the problem on 1.4.2_26.
            String str2 = new String(bytes, "JISAutoDetect");
            // String str2 = new String(bytes, "MS932");
            System.out.println(str2);
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
}
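Because System.out.println re-encodes the string with the platform default encoding, the od output alone does not show whether a 0x3f comes from the decoding step or from the output step. The sketch below might help separate the two by printing the decoded UTF-16 code units directly. The class name DecodeCheck and the method decodeToHex are hypothetical and not part of the attached test case, and the sketch assumes that the "MS932" and "JISAutoDetect" charsets are available in the JDK being tested.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical diagnostic class, not part of the original test case.
class DecodeCheck {
    // Decode the bytes with the named charset and return the resulting
    // UTF-16 code units as space-separated hex values.
    static String decodeToHex(byte[] bytes, String charsetName)
            throws UnsupportedEncodingException {
        String decoded = new String(bytes, charsetName);
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < decoded.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(decoded.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Byte sequence from the JDK 6u19 dump, without the trailing newline.
        byte[] bytes = {
            (byte) 0x81, 0x5b, (byte) 0x87, (byte) 0x8a, (byte) 0x87, 0x40,
            (byte) 0x87, 0x54, (byte) 0x87, 0x41, (byte) 0x87, 0x55
        };
        System.out.println("MS932:         " + decodeToHex(bytes, "MS932"));
        System.out.println("JISAutoDetect: " + decodeToHex(bytes, "JISAutoDetect"));
    }
}
```

If the two printed lines differ on a given JDK, the difference is introduced while decoding rather than while writing to standard output. A value of fffd in a line marks a byte sequence the decoder could not map.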