JDK-4426470

String.getBytes() method does not convert some Big5 characters correctly


    • Type: Bug
    • Resolution: Duplicate
    • Priority: P4
    • None
    • 1.1.7, 1.4.0
    • core-libs

      I have attached a Java program that reproduces the bug. The program converts a Big5-encoded byte array to a String and then converts it back to bytes by calling String.getBytes("Big5").

      The bytes returned from String.getBytes() should match the original bytes.
      The program tries to round-trip the following Big5 characters:
      f9d4
      f9d5
      f9d6
      f9d7
      f9d8
      f9dd
      f9de

      but only the first two (f9d4 and f9d5) are converted back correctly;
      each of the others comes back as the single byte 0x3f ('?').
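
      The 0x3f in the output is consistent with the documented behavior of String.getBytes(), which replaces any character the target charset cannot encode with that charset's default replacement bytes instead of throwing. The sketch below illustrates that substitution with US-ASCII, chosen only because it is available on every JVM. The class and method names are hypothetical, and the report does not confirm that this is the mechanism behind the Big5 failures.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReplacementDemo {

    // Encode with String.getBytes(Charset), which silently substitutes the
    // charset's replacement bytes for any character it cannot map.
    static byte[] lenientEncode(String s, Charset cs) {
        return s.getBytes(cs);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) has no US-ASCII encoding.
        byte[] out = lenientEncode("\u00e9", StandardCharsets.US_ASCII);
        System.out.printf("0x%02x%n", out[0] & 0xff); // prints 0x3f
    }
}
```

      A single 0x3f in place of a two-byte character is therefore a sign of a missing or one-way mapping rather than of corrupted output.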

      The customer who hit this bug is using JDK 1.3.
      I can also reproduce the problem with both of the following JDKs:
      ladybird JDK 1.3.1-rc1-b19 and merlin JDK 1.4.0-beta-b56.

      Here's the test case:

      public class f2 {

          // Format a byte as an unsigned hex value, e.g. 0xf9.
          public static String hexStr(byte b) {
              int i = 0xff & b;
              return "0x" + Integer.toString(i, 16);
          }

          public static void main(String[] args) {
              try {
                  // Big5 byte pairs to round-trip.
                  byte[][] inbuf = {
                      { (byte)0xf9, (byte)0xd4 },
                      { (byte)0xf9, (byte)0xd5 },
                      { (byte)0xf9, (byte)0xd6 },
                      { (byte)0xf9, (byte)0xd7 },
                      { (byte)0xf9, (byte)0xd8 },
                      { (byte)0xf9, (byte)0xdd },
                      { (byte)0xf9, (byte)0xde },
                  };

                  System.out.println("platform encoding = " +
                                     System.getProperty("file.encoding"));

                  for (int i = 0; i < inbuf.length; i++) {
                      System.out.println("Original bytes :" +
                                         hexStr(inbuf[i][0]) + " " + hexStr(inbuf[i][1]));

                      // Decode from Big5, then encode back to Big5.
                      String s = new String(inbuf[i], "BIG5");
                      byte[] buf = s.getBytes("BIG5");
                      System.out.print("Converted bytes:");
                      for (int b = 0; b < buf.length; b++) {
                          System.out.print(hexStr(buf[b]) + " ");
                      }
                      System.out.println("");
                  }
              } catch (java.io.UnsupportedEncodingException e) {
                  System.out.println("ERROR: " + e);
              }
          }
      }

      Here's the output:

      platform encoding = BIG5
      Original bytes :0xf9 0xd4
      Converted bytes:0xf9 0xd4
      Original bytes :0xf9 0xd5
      Converted bytes:0xf9 0xd5
      Original bytes :0xf9 0xd6
      Converted bytes:0x3f
      Original bytes :0xf9 0xd7
      Converted bytes:0x3f
      Original bytes :0xf9 0xd8
      Converted bytes:0x3f
      Original bytes :0xf9 0xdd
      Converted bytes:0x3f
      Original bytes :0xf9 0xde
      Converted bytes:0x3f
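
      The output above does not show whether the information is lost while decoding or while encoding. One way to narrow that down is to run the same round trip through java.nio.charset coders configured to report errors instead of substituting replacement bytes. The following is a minimal sketch, not part of the original test case: the class name StrictRoundTrip and the helper check are hypothetical, it needs a JDK with the java.nio.charset API (1.4 or later), and it assumes the runtime provides a Big5 charset.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

public class StrictRoundTrip {

    // Decode and re-encode with error reporting enabled.
    // Returns "ok", "decode failed", "encode failed", or "bytes differ".
    static String check(byte[] input, Charset cs) {
        CharBuffer chars;
        try {
            chars = cs.newDecoder()
                      .onMalformedInput(CodingErrorAction.REPORT)
                      .onUnmappableCharacter(CodingErrorAction.REPORT)
                      .decode(ByteBuffer.wrap(input));
        } catch (CharacterCodingException e) {
            return "decode failed";
        }
        ByteBuffer encoded;
        try {
            encoded = cs.newEncoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT)
                        .encode(chars);
        } catch (CharacterCodingException e) {
            return "encode failed";
        }
        // ByteBuffer.equals compares the remaining bytes of both buffers.
        return encoded.equals(ByteBuffer.wrap(input)) ? "ok" : "bytes differ";
    }

    public static void main(String[] args) {
        // Same byte pairs as in the report's test case.
        int[] trailBytes = { 0xd4, 0xd5, 0xd6, 0xd7, 0xd8, 0xdd, 0xde };
        if (!Charset.isSupported("Big5")) {
            System.out.println("Big5 is not available in this runtime");
            return;
        }
        Charset big5 = Charset.forName("Big5");
        for (int trail : trailBytes) {
            byte[] pair = { (byte) 0xf9, (byte) trail };
            System.out.println("f9" + Integer.toHexString(trail) + ": " + check(pair, big5));
        }
    }
}
```

      A result of "encode failed" would point at a missing Unicode-to-Big5 mapping, while "decode failed" would point at the Big5-to-Unicode direction. The results depend on the JDK build, so none are listed here.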

            Assignee: sherman Xueming Shen
            Reporter: mchansunw Mei Chan (Inactive)