JDK-5058133

iso2022 encoders throw BufferOverflowException

    • Type: Bug
    • Resolution: Fixed
    • Priority: P2
    • 5.0
    • 5.0
    • core-libs

      The charsets ISO-2022-KR, x-ISO-2022-CN-CNS, and x-ISO-2022-CN-GB sometimes
      throw BufferOverflowException, especially when encoding a single char.

      For example, the following test program:
      --------------------------------------------------------------
      import java.io.*;
      import java.util.*;
      import java.nio.charset.*;
      import java.nio.*;

      public class FindOneCharEncoderBugs {
          public static void main(String[] args) throws Exception {
              for (Map.Entry<String,Charset> e
                       : Charset.availableCharsets().entrySet()) {
                  String csn = e.getKey();
                  Charset cs = e.getValue();
                  int failures = 0;
                  System.out.println(csn);
                  if (csn.equals("x-IBM933")) continue; // hangs!

                  // Ignore decoder-only charsets
                  try { cs.newEncoder(); }
                  catch (UnsupportedOperationException x) { continue; }

                  for (int i = 0; failures < 5 && i <= 0xffff; i++) {
                      String s = new String(new char[] { (char)i });
                      try {
                          s.getBytes(csn);
                      } catch (BufferOverflowException x) {
                          System.out.printf("Overflow: charset=%s char=%x%n", csn, i);
                          failures++;
                          i += 100;
                      } catch (Throwable t) {
                          System.out.printf("%s charset=%s char=%x%n", t, csn, i);
                          i += 100;
                      }
                  }
              }
          }
      }
      --------------------------------------------------------------
      prints (among other things):

      Overflow: charset=ISO-2022-KR char=a1
      Overflow: charset=ISO-2022-KR char=111
      Overflow: charset=ISO-2022-KR char=2c7
      Overflow: charset=ISO-2022-KR char=391
      Overflow: charset=ISO-2022-KR char=401
      Overflow: charset=x-ISO-2022-CN-CNS char=a7
      Overflow: charset=x-ISO-2022-CN-CNS char=2c7
      Overflow: charset=x-ISO-2022-CN-CNS char=391
      Overflow: charset=x-ISO-2022-CN-CNS char=2013
      Overflow: charset=x-ISO-2022-CN-CNS char=2103
      Overflow: charset=x-ISO-2022-CN-GB char=a4
      Overflow: charset=x-ISO-2022-CN-GB char=113
      Overflow: charset=x-ISO-2022-CN-GB char=1ce
      Overflow: charset=x-ISO-2022-CN-GB char=2c7
      Overflow: charset=x-ISO-2022-CN-GB char=391

      It is particularly egregious that strings consisting of one ASCII character
      cannot be correctly encoded.
      ###@###.### 2004-06-06
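
For contrast with the failing `String.getBytes` path, the following is a minimal sketch of how a caller can drive a `CharsetEncoder` directly, treating a too-small output buffer as `CoderResult.OVERFLOW` and retrying with a larger one. The class name `OneCharEncode` and the helpers `encodeOneChar` and `grow` are hypothetical names introduced for illustration; they are not part of the JDK or of the fix for this bug. The sketch assumes an encoder that honors the documented `encode`/`flush` contract. It is not a workaround for an encoder that itself throws `BufferOverflowException`.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

public class OneCharEncode {

    // Hypothetical helper: copy the bytes written so far into a buffer twice as large.
    static ByteBuffer grow(ByteBuffer old) {
        ByteBuffer bigger = ByteBuffer.allocate(old.capacity() * 2);
        old.flip();
        bigger.put(old);
        return bigger;
    }

    // Hypothetical helper: encode a single char, growing the output buffer on OVERFLOW.
    static byte[] encodeOneChar(Charset cs, char c) throws CharacterCodingException {
        CharsetEncoder enc = cs.newEncoder();
        CharBuffer in = CharBuffer.wrap(new char[] { c });
        ByteBuffer out = ByteBuffer.allocate(1); // deliberately tiny to exercise OVERFLOW

        CoderResult cr;
        // Phase 1: encode the input; endOfInput is true because there is only one char.
        while ((cr = enc.encode(in, out, true)).isOverflow()) {
            out = grow(out);
        }
        if (cr.isError()) {
            cr.throwException(); // malformed or unmappable input
        }
        // Phase 2: flush any trailing bytes (stateful encodings may emit shift sequences).
        while (enc.flush(out).isOverflow()) {
            out = grow(out);
        }

        out.flip();
        byte[] result = new byte[out.remaining()];
        out.get(result);
        return result;
    }

    public static void main(String[] args) {
        String[] names = { "ISO-2022-KR", "x-ISO-2022-CN-CNS", "x-ISO-2022-CN-GB" };
        for (String name : names) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this runtime");
                continue;
            }
            try {
                byte[] bytes = encodeOneChar(Charset.forName(name), '\u0391');
                System.out.printf("%s: U+0391 encoded to %d bytes%n", name, bytes.length);
            } catch (CharacterCodingException | RuntimeException x) {
                System.out.printf("%s: %s%n", name, x);
            }
        }
    }
}
```

The initial one-byte buffer is chosen only to force the overflow path. A real caller would normally start from an estimate based on `maxBytesPerChar()` and still keep the retry loop.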

            martin Martin Buchholz