JDK-4971863

SJIS/MS932 encodings handled by J2SE 1.4.1 but not 1.4.2


    • Type: Bug
    • Resolution: Not an Issue
    • Priority: P4
    • None
    • 1.4.2_03, 5.0
    • core-libs

      NS Solutions Corp./Sony provided a foo.txt file that they say contains the following SJIS-encoded characters:

      Character                   SJIS      MS932
      =================================================
      (FULLWIDTH TILDE)           0x8160    U+FF5E
      (DOUBLE VERTICAL LINE)      0x8161    U+2016
      (EM DASH)                   0x815C    U+2014
      (FULLWIDTH CENT SIGN)       0x8191    U+FFE0
      (FULLWIDTH POUND SIGN)      0x8192    U+FFE1
      (FULLWIDTH NOT SIGN)        0x81CA    U+FFE2
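
To check what a given Java runtime actually does with the byte values listed above, one option is to decode each pair under both charsets and print the resulting code points. This is only a sketch, not the attached Test.java: the class name `SjisVsMs932` and its `decode` helper are made up for illustration, it assumes a modern JDK (Java 5 or later) rather than 1.4.x, and it assumes the runtime ships the extended charsets `Shift_JIS` and `windows-31j` (the canonical java.nio name for MS932).

```java
import java.nio.charset.Charset;

// Hypothetical helper class for illustration; not part of the original test case.
final class SjisVsMs932 {

    // Decode one double-byte sequence with the named charset and
    // return the first code point of the decoded string.
    static int decode(int hi, int lo, String charsetName) {
        byte[] bytes = { (byte) hi, (byte) lo };
        String s = new String(bytes, Charset.forName(charsetName));
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        // Byte pairs taken from the table in this report.
        int[][] pairs = {
            {0x81, 0x60}, {0x81, 0x61}, {0x81, 0x5C},
            {0x81, 0x91}, {0x81, 0x92}, {0x81, 0xCA}
        };
        for (int[] p : pairs) {
            System.out.printf("0x%02X%02X  Shift_JIS=U+%04X  windows-31j=U+%04X%n",
                    p[0], p[1],
                    decode(p[0], p[1], "Shift_JIS"),
                    decode(p[0], p[1], "windows-31j"));
        }
    }
}
```

Comparing the two printed columns shows directly where the two charsets disagree on a given runtime, without depending on the terminal's own encoding.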


      I've attached this file and the Java code used to test it. I wrote the following script to run the test:


      LC_ALL=ja; export LC_ALL
      LANG=ja; export LANG
      locale
      javac Test.java
      java -Dfile.encoding=MS932 Test ms932.txt
      rm -f ms932sjis.txt
      iconv -fSJIS -teucJP ms932.txt > ms932sjis.txt
      echo "==== ms932.txt output ===="
      cat ms932.txt
      echo "==== ms932-sjis conversion (ms932sjis.txt) ===="
      cat ms932sjis.txt


      Per Bug ID 4375816, -Dfile.encoding=MS932 shouldn't make a difference, but I kept it because it was part of the customer's original instructions.
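
If the goal is to decode the file as MS932 regardless of locale or of the file.encoding property, a reader can name the charset explicitly. The following is a minimal sketch under that assumption; it is not the attached Test.java, and the class name `ExplicitCharsetRead` and method `readFirstLine` are hypothetical. It uses `java.nio.file`, so it needs Java 7 or later rather than the 1.4.x releases discussed here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical example class; names are illustrative only.
final class ExplicitCharsetRead {

    // Read the first line of a file, decoding with the named charset
    // instead of the platform default. Returns "" for an empty file.
    static String readFirstLine(Path file, String charsetName) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(file, Charset.forName(charsetName))) {
            String line = in.readLine();
            return line == null ? "" : line;
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the input file, for example ms932.txt.
        String line = readFirstLine(Paths.get(args[0]), "windows-31j");
        // Print code points rather than glyphs so the terminal encoding does not matter.
        line.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

Printing code points instead of the characters themselves keeps the terminal's locale out of the comparison between Java releases.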

      Running this with Java 1.4.1 produces the following in my terminal (note that the bug report cannot reproduce the output exactly):

      taffer:gregv ~/CASES/10469592(105) run
      LANG=ja
      LC_CTYPE="ja"
      LC_NUMERIC="ja"
      LC_TIME="ja"
      LC_COLLATE="ja"
      LC_MONETARY="ja"
      LC_MESSAGES="ja"
      LC_ALL=ja
      MS932
      ==== ms932.txt output ====
      ?`?a?\??????
      ==== ms932-sjis conversion (ms932sjis.txt) ====
      ????????????


      With any other Java release (1.4.2_02 shown here), the output is:

      taffer:gregv ~/CASES/10469592(108) run
      LANG=ja
      LC_CTYPE="ja"
      LC_NUMERIC="ja"
      LC_TIME="ja"
      LC_COLLATE="ja"
      LC_MONETARY="ja"
      LC_MESSAGES="ja"
      LC_ALL=ja
      MS932
      ==== ms932.txt output ====
      ??????
      ==== ms932-sjis conversion (ms932sjis.txt) ====
      ??????

            sherman Xueming Shen
            duke J. Duke