JDK-4483377

wrong CharToByte and ByteToChar conversions for SJIS encoding


    • generic
    • generic



      Name: ooR10006 Date: 07/24/2001


      In jdk1.4.0beta-b72, the methods
      OutputStreamWriter(ByteArrayOutputStream, encodingName).write(char[]) and
      InputStreamReader(ByteArrayInputStream, encodingName).read(char[], int, int)
      convert characters and bytes incorrectly for the encoding "SJIS".
      Specifically, the problems are:

      1. The character 0x005C is encoded as (byte)0x5C instead of the 2-byte
      sequence (byte)0x81, (byte)0x5F.

      2. The byte 0x5C is decoded as char 0x005C instead of 0x00A5.

      3. The 2-byte sequence (byte)0x81, (byte)0x5F is decoded as char 0xFF3C
      instead of char 0x005C.

      4. The byte 0x7E is decoded as char 0x007E instead of 0x203E, even though
      the char 0x203E is correctly encoded as byte 0x7E.
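
      The four expectations above can be written down as a tiny lookup. The
      sketch below is illustrative only: ExpectedSjisMapping and its methods
      are hypothetical names, not JDK classes, and they cover just the bytes
      and chars named in this report, not the full SJIS table. Reading 0x5C
      as the yen sign and 0x7E as the overline appears to follow the
      JIS X 0201 Roman interpretation of the single-byte range; that is an
      interpretation, not something the report states.

```java
import java.io.ByteArrayOutputStream;

/**
 * Hypothetical helper, not part of the JDK: models only the byte
 * sequences and chars named in the report, not the full SJIS table.
 */
final class ExpectedSjisMapping {

    /** Decodes 0x5C, 0x7E and the pair 0x81 0x5F the way the report expects. */
    static char[] decodeExpected(byte[] in) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < in.length) {
            int b = in[i] & 0xFF;
            if (b == 0x5C) {
                out.append((char) 0x00A5);          // item 2: yen sign
                i += 1;
            } else if (b == 0x7E) {
                out.append((char) 0x203E);          // item 4: overline
                i += 1;
            } else if (b == 0x81 && i + 1 < in.length
                    && (in[i + 1] & 0xFF) == 0x5F) {
                out.append((char) 0x005C);          // item 3: reverse solidus
                i += 2;
            } else {
                throw new IllegalArgumentException(
                        "byte not covered by this sketch: 0x" + Integer.toHexString(b));
            }
        }
        return out.toString().toCharArray();
    }

    /** Encodes the two chars whose expected bytes the report gives. */
    static byte[] encodeExpected(char[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (char c : in) {
            if (c == 0x005C) {
                out.write(0x81);                    // item 1: 2-byte sequence
                out.write(0x5F);
            } else if (c == 0x203E) {
                out.write(0x7E);                    // item 4: single byte
            } else {
                throw new IllegalArgumentException(
                        "char not covered by this sketch: 0x" + Integer.toHexString(c));
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] input = {(byte) 0x5C, (byte) 0x81, (byte) 0x5F, (byte) 0x7E};
        for (char c : decodeExpected(input)) {
            System.out.println("0x" + Integer.toHexString(c));
        }
    }
}
```

      Run on the four input bytes used in the report, decodeExpected yields
      the three chars 0xa5, 0x5c, 0x203e, which are the values the report
      lists as expected.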

      The following test demonstrates the four problems:

      import java.io.ByteArrayInputStream;
      import java.io.ByteArrayOutputStream;
      import java.io.InputStreamReader;
      import java.io.OutputStreamWriter;
      import java.io.IOException;
      import java.io.UnsupportedEncodingException;

      public class test {
          public static void main(String[] args){
              byteToChar();
              charToByte();
          }
          static void byteToChar(){
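              char[] receivedChars = new char[4];
              // 0x5C and 0x7E are single-byte codes; 0x81 0x5F is one 2-byte code.
              byte[] inputBytes = {(byte)0x5c, (byte)0x81, (byte)0x5F, (byte)0x7e};
              // Four input bytes are expected to decode to three chars.
              char[] expectedChars = {(char)0xA5, (char)0x5C, (char)0x203E};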
              try {
                  ByteArrayInputStream bais = new ByteArrayInputStream(inputBytes);
                  InputStreamReader reader = new InputStreamReader(bais, "SJIS");
                  reader.read(receivedChars, 0, receivedChars.length);
                  System.out.println("byte sequence for decoding: (byte)0x5C "
                          + "(byte)0x81, (byte)0x5F, (byte)0x7e");
                  System.out.println("decoded chars: "
                          + "0x" + Integer.toHexString(receivedChars[0]) + ", "
                          + "0x" + Integer.toHexString(receivedChars[1]) + ", "
                          + "0x" + Integer.toHexString(receivedChars[2]));
                  System.out.println("expected chars: "
                          + "0x" + Integer.toHexString(expectedChars[0]) + ", "
                          + "0x" + Integer.toHexString(expectedChars[1]) + ", "
                          + "0x" + Integer.toHexString(expectedChars[2]));
                          
              } catch(UnsupportedEncodingException e) {
                  return;
              } catch(IOException ex) {
                  return;
              }
          }
          static void charToByte(){
              char[] inputChars = {(char)0x5C};
              byte[] expectedBytes = {(byte)0x81, (byte)0x5F};
              try {
                  ByteArrayOutputStream baos = new ByteArrayOutputStream(10);
                  OutputStreamWriter writer = new OutputStreamWriter(baos, "SJIS");
                  writer.write(inputChars);
                  writer.flush();
                  byte[] bytes = baos.toByteArray();
                  System.out.println("chars for encoding: (char)0x5C ");
                  System.out.println("encoding bytes: "
                          + "0x" + Integer.toHexString(bytes[0] & 0xFF));
                  System.out.println("expected bytes: "
                          + "0x" + Integer.toHexString(expectedBytes[0] & 0xFF) + ", "
                          + "0x" + Integer.toHexString(expectedBytes[1] & 0xFF));

              } catch (UnsupportedEncodingException e) {
                  return;
              } catch (IOException ex) {
                  return;
              }
          }
      }
      % jdk1.4.0beta-b72/solsparc/bin/java test
      byte sequence for decoding: (byte)0x5C (byte)0x81, (byte)0x5F, (byte)0x7e
      decoded chars: 0x5c, 0xff3c, 0x7e
      expected chars: 0xa5, 0x5c, 0x203e
      chars for encoding: (char)0x5C
      encoding bytes: 0x5c
      expected bytes: 0x81, 0x5f
      %
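
      To check how a particular runtime behaves, the same conversions can
      be driven through java.nio.charset directly. This is a minimal sketch
      under assumptions: SjisProbe, hexChars and hexBytes are hypothetical
      names; it relies on String(byte[], Charset) and String.getBytes(Charset),
      which need a JDK newer than the 1.4 beta discussed here; and the "SJIS"
      charset may be absent from a reduced runtime, so the probe checks for
      it first. No output is shown because it depends on the JDK in use.

```java
import java.nio.charset.Charset;

/** Hypothetical probe, not the reporter's test: prints what a given charset does. */
final class SjisProbe {

    /** Formats each char as 0x.., comma separated, like the report's output. */
    static String hexChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(", ");
            sb.append("0x").append(Integer.toHexString(s.charAt(i)));
        }
        return sb.toString();
    }

    /** Formats each byte as an unsigned 0x.. value, comma separated. */
    static String hexBytes(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append("0x").append(Integer.toHexString(bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Charset name defaults to the one used in the report.
        String name = args.length > 0 ? args[0] : "SJIS";
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime");
            return;
        }
        Charset cs = Charset.forName(name);

        // Same input bytes as byteToChar() in the report.
        byte[] in = {(byte) 0x5C, (byte) 0x81, (byte) 0x5F, (byte) 0x7E};
        System.out.println("decoded chars: " + hexChars(new String(in, cs)));

        // Same input char as charToByte() in the report.
        String backslash = String.valueOf((char) 0x5C);
        System.out.println("encoded bytes: " + hexBytes(backslash.getBytes(cs)));
    }
}
```

      Passing another charset name as the first argument makes it possible
      to compare encodings side by side on the same input.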

      Because of this, the following JCK test fails:

      api/java_io/mbCharEncoding/index.html#ShiftJIS

      The test is correct and has been failing since 1.1.6.

      ======================================================================

            ilittlesunw Ian Little (Inactive)
            ovosunw Ovo Ovo (Inactive)