Bug
Resolution: Not an Issue
P4
None
1.4.2_03, 5.0
x86, sparc
solaris_9, windows_xp
NS Solutions Corp./Sony provided a foo.txt file that they say contains the following SJIS-encoded characters:
SJIS MS932
=================================================
Character                  Code     Unicode
(FULLWIDTH TILDE)          0x8160   U+FF5E
(DOUBLE VERTICAL LINE)     0x8161   U+2016
(EM DASH)                  0x815C   U+2014
(FULLWIDTH CENT SIGN)      0x8191   U+FFE0
(FULLWIDTH POUND SIGN)     0x8192   U+FFE1
(FULLWIDTH NOT SIGN)       0x81CA   U+FFE2
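One way to check which Unicode code point each of these byte pairs decodes to is to decode it under both charset names and print the result. The following is a minimal sketch, not the attached Test.java; the class name MappingCheck and its decode helper are hypothetical, and it assumes a JDK where both the Shift_JIS and MS932 charsets are installed.

```java
import java.nio.charset.Charset;

public class MappingCheck {
    // Double-byte codes taken from the table above.
    static final int[] CODES = {0x8160, 0x8161, 0x815C, 0x8191, 0x8192, 0x81CA};

    // Decode one double-byte code with the named charset and
    // return the first resulting code point formatted as "U+XXXX".
    static String decode(int code, String charsetName) {
        byte[] bytes = {(byte) (code >> 8), (byte) code};
        String s = new String(bytes, Charset.forName(charsetName));
        return String.format("U+%04X", s.codePointAt(0));
    }

    public static void main(String[] args) {
        for (int code : CODES) {
            System.out.printf("0x%04X  Shift_JIS=%s  MS932=%s%n",
                    code, decode(code, "Shift_JIS"), decode(code, "MS932"));
        }
    }
}
```

Where the two columns differ for a byte pair, the characters read from the file depend on which charset name was used to decode it.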
I've attached this file and the Java code used to test it. I wrote the following script to run the test:
LC_ALL=ja; export LC_ALL
LANG=ja; export LANG
locale
javac Test.java
java -Dfile.encoding=MS932 Test ms932.txt
rm -f ms932sjis.txt
iconv -fSJIS -teucJP ms932.txt > ms932sjis.txt
echo "==== ms932.txt output ===="
cat ms932.txt
echo "==== ms932-sjis conversion (ms932sjis.txt) ===="
cat ms932sjis.txt
Per Bug ID 4375816, -Dfile.encoding=MS932 shouldn't make a difference, but I left it in because it was part of the customer's original instructions.
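Since the file.encoding property is not expected to matter here, a test can avoid depending on it by naming the charset explicitly when it opens the file. The sketch below illustrates that approach under stated assumptions: it is not the attached Test.java (which is not shown in this report), the class name ExplicitCharsetRead is hypothetical, and it uses try-with-resources, so it needs a newer JDK than the releases discussed here.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

public class ExplicitCharsetRead {
    // Read the whole file, decoding with the named charset
    // instead of the platform default / file.encoding.
    static String read(String fileName, String charsetName) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader in = new InputStreamReader(new FileInputStream(fileName), charsetName)) {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = read(args[0], "MS932");
        // Print each UTF-16 unit as a code point so the result
        // does not depend on what the terminal can display.
        for (int i = 0; i < text.length(); i++) {
            System.out.printf("U+%04X%n", (int) text.charAt(i));
        }
    }
}
```

It could be run as `java ExplicitCharsetRead ms932.txt`, with or without -Dfile.encoding on the command line.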
Running this with Java 1.4.1 produces the following in my terminal. (Note: the bug report cannot show the output exactly as it appeared.)
taffer:gregv ~/CASES/10469592(105) run
LANG=ja
LC_CTYPE="ja"
LC_NUMERIC="ja"
LC_TIME="ja"
LC_COLLATE="ja"
LC_MONETARY="ja"
LC_MESSAGES="ja"
LC_ALL=ja
MS932
==== ms932.txt output ====
?`?a?\??????
==== ms932-sjis conversion (ms932sjis.txt) ====
????????????
With any other Java release (1.4.2_02 shown here), the output is different:
taffer:gregv ~/CASES/10469592(108) run
LANG=ja
LC_CTYPE="ja"
LC_NUMERIC="ja"
LC_TIME="ja"
LC_COLLATE="ja"
LC_MONETARY="ja"
LC_MESSAGES="ja"
LC_ALL=ja
MS932
==== ms932.txt output ====
??????
==== ms932-sjis conversion (ms932sjis.txt) ====
??????
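Because the terminal output cannot be reproduced exactly in this report, comparing the raw bytes of the files is less ambiguous than comparing what `cat` displays. The following is a minimal sketch of a hex dump for that purpose; the class name HexDump is hypothetical, and it assumes a JDK with java.nio.file (Java 7 or later).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class HexDump {
    // Render bytes as space-separated, two-digit uppercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Dump every file named on the command line.
        for (String name : args) {
            System.out.println(name + ": " + toHex(Files.readAllBytes(Paths.get(name))));
        }
    }
}
```

For example, `java HexDump ms932.txt ms932sjis.txt` would print the bytes of the original file and of the iconv output side by side for each Java release under test.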