  2. JDK-6245179

(cs) Encoding a string to a byte[] adds nulls to end of byte array


    • Type: Bug
    • Resolution: Duplicate
    • Priority: P4
    • None
    • 5.0
    • core-libs
    • x86
    • windows_2000

      FULL PRODUCT VERSION :
      java version "1.5.0_02"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_02-b09)
      Java HotSpot(TM) Client VM (build 1.5.0_02-b09, mixed mode, sharing)

      ADDITIONAL OS VERSION INFORMATION :
      Microsoft Windows 2000 [Version 5.00.2195]

      A DESCRIPTION OF THE PROBLEM :
      Encoding a string to a byte array produces the following results. All chars are ASCII (in this case the letter 'a') and the encoding is UTF-8.

      A string of 1-9 characters results in the same # of bytes.

      A string of 10-19 chars results in a single additional null at the end of the byte array.

      A string of 20-29 chars results in two nulls, and so on.
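
The counts described above fit a simple pattern: one extra byte for every ten characters. The sketch below only restates that arithmetic. `ObservedPattern` and `observedArrayLength` are hypothetical names, and the formula is inferred from the samples in this report, not taken from the JDK source.

```java
public class ObservedPattern {
    // Hypothetical helper: the byte-array length this report observes for an
    // ASCII string of `len` chars (one trailing null per ten chars).
    public static int observedArrayLength(int len) {
        return len + len / 10;
    }

    public static void main(String[] args) {
        // Print the observed array length for a few boundary lengths.
        for (int len : new int[] {9, 10, 19, 20}) {
            System.out.println("String length=" + len
                    + " bytes length=" + observedArrayLength(len));
        }
    }
}
```

The values for lengths 10, 19 and 20 match the samples listed under ACTUAL.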

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Run the simple class below.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      I expected the # of chars and the # of bytes to be the same.
      ACTUAL -
      Here are some samples:

      String length=10 bytes length=11
        a a a a a a a a a a
       97 97 97 97 97 97 97 97 97 97 0


      String length=11 bytes length=12
        a a a a a a a a a a a
       97 97 97 97 97 97 97 97 97 97 97 0

      String length=19 bytes length=20
        a a a a a a a a a a a a a a a a a a a
       97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 0


      String length=20 bytes length=22
        a a a a a a a a a a a a a a a a a a a a
       97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 0 0




      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------

      import java.nio.*;
      import java.nio.charset.*;

      public class EncodingTest {
          
          public static String createTestString(int len,char charval) {
              StringBuffer sb = new StringBuffer();
              for (int i=0; i < len; i++) {
                  sb.append(charval);
              }
              return sb.toString();
          }
          
          public static void doEncodingTest(int len,char charval) {
              try {
                  String val = createTestString(len,charval);
                  Charset cs = Charset.forName("utf-8");
                  CharsetEncoder cse = cs.newEncoder();
                  char[] chars = val.toCharArray();
                  CharBuffer cb = CharBuffer.wrap(chars);
                  ByteBuffer bb = cse.encode(cb);
                  cse.flush(bb);
                  // array() returns the whole backing array, not just the bytes up to bb.limit()
                  byte[] bytes = bb.array();
                  if (val.length() != bytes.length) {
                      System.out.println("String length=" + val.length() + " bytes length=" + bytes.length);
                      for (int i = 0; i < val.length(); i++) {
                          System.out.print(" " + val.charAt(i));
                      }
                      System.out.println("");
                      for (int i = 0; i < bytes.length; i++) {
                          System.out.print(" " + bytes[i]);
                      }
                      System.out.println("");
                  }
              } catch(CharacterCodingException e) {
                  System.out.println(e.toString());
              }
              System.out.println("");
          }
          
          public static void doEncodingTests(int max,char charval) {
              for (int i = 1; i <= max; i++) {
                  doEncodingTest(i,charval);
                  System.out.println("");
              }
          }
          
          public static void main(String[] args) {
              doEncodingTests(30,'a');
          }
          
      }
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      I strip null chars off the end, which adds overhead. The above case is a gross
      simplification meant to show the problem.
      ###@###.### 2005-03-23 22:52:14 GMT
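
As an alternative to stripping nulls, one option is to copy only the bytes between the buffer's position and limit. `ByteBuffer.array()` returns the whole backing array, which can be larger than the encoded content, so the trailing zeros shown above are consistent with unused buffer capacity, not encoded output. A minimal sketch under that reading follows; `ExactEncode` and `encodeExact` are hypothetical names introduced for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class ExactEncode {
    // Hypothetical helper: encode `text` as UTF-8 and return only the bytes
    // the encoder actually wrote.
    public static byte[] encodeExact(String text) throws CharacterCodingException {
        CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
        ByteBuffer bb = encoder.encode(CharBuffer.wrap(text));
        // The returned buffer is ready for reading, so remaining() is the
        // number of encoded bytes.
        byte[] bytes = new byte[bb.remaining()];
        // Copies position..limit only, ignoring any spare capacity.
        bb.get(bytes);
        return bytes;
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] bytes = encodeExact("aaaaaaaaaaaaaaaaaaaa"); // 20 chars
        System.out.println("String length=20 bytes length=" + bytes.length);
    }
}
```

This avoids scanning for trailing zeros, and it does not drop a genuine U+0000 at the end of the input, which null-stripping would.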

            iris Iris Clark
            gmanwanisunw Girish Manwani (Inactive)