JDK-4261449

Code page Cp1046 should be Cp1047


    • kestrel
    • generic
    • generic



      Name: pa48320 Date: 08/10/99



      docs/guide/intl/encoding.doc.html (for 1.1.7 and 1.2.2)
      says Cp1046 is "IBM Open Edition US EBCDIC", but
      Cp1046 is supposed to be an Arabic code page;
      see http://www.austin.ibm.com/doc_link/en_US/a_doc_lib/aixkybd/kybdtech/Arabic2.htm
      Cp1047 is supposed to be IBM US EBCDIC;
      see http://www.s390.ibm.com/products/oe/bpxqs11.html
      Cp500 does appear to be an EBCDIC encoding, but
      I don't know if it matches IBM US EBCDIC.
      Running the following program shows that
      Cp1046 does not yield the expected EBCDIC
      decoding of the byte sequence
      0x81, 0x91, 0xA2, 0xC1, 0xD1, 0xE2. Cp1047
      is not accepted as a valid encoding name, while
      Cp500 and Cp037 do decode the bytes as EBCDIC.

      Run with:
      java ebcdic Cp1046 Cp1047 Cp037

      public class ebcdic {
          static String expected = "ajsAJS";
          public static void main(String argv[])
          {
              byte bytes[] = new byte[] {
                  (byte) 0x81, // 'a'
                  (byte) 0x91, // 'j'
                  (byte) 0xa2, // 's'
                  (byte) 0xc1, // 'A'
                  (byte) 0xD1, // 'J'
                  (byte) 0xE2, // 'S'
              };
              for (int i = 0; i < argv.length; i++)
                  display(bytes, argv[i]);
          }
          static void display(byte bytes[], String encoding)
          {
              try {
                  System.out.println("Encoding: " + encoding);
                  String actual = new String(bytes, encoding);
                  System.out.println(actual);
                  for (int i = 0; i < expected.length(); i++)
                  {
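                      // Keep these as char so the message prints the
                      // characters themselves, not their numeric codes.
                      char e = expected.charAt(i);
                      char a = actual.charAt(i);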
                      System.out.print("At char " + i +
                                       ", expect '" + e +
                                       "', found '" + a + "'");
                      if (e != a)
                          System.out.println(" -- Difference!");
                      else
                          System.out.println();
                  }
              }
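              // Catch only the "unknown encoding" case so other failures
              // are not misreported as a missing encoding.
              catch(java.io.UnsupportedEncodingException e)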
              {
                  System.out.println("No such encoding: " + encoding);
              }
          }
      }
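
The expected string "ajsAJS" follows from the EBCDIC letter layout, where each case of the Latin alphabet occupies three separate byte runs. The sketch below is not a real code page table: it is a hand-built, letters-only mapping (the class name `EbcdicLetters` and the `'?'` placeholder for unmapped bytes are assumptions made for illustration), covering only the letter positions that Cp037, Cp500 and Cp1047 have in common.

```java
import java.util.Arrays;

// Illustrative helper: decodes only the Latin letters of EBCDIC.
// Every other byte maps to '?' (an assumption of this sketch).
final class EbcdicLetters {
    private static final char[] TABLE = new char[256];

    static {
        Arrays.fill(TABLE, '?');
        fill(0x81, 'a', 9); // a-i
        fill(0x91, 'j', 9); // j-r
        fill(0xA2, 's', 8); // s-z
        fill(0xC1, 'A', 9); // A-I
        fill(0xD1, 'J', 9); // J-R
        fill(0xE2, 'S', 8); // S-Z
    }

    // Maps `count` consecutive bytes starting at firstByte to
    // consecutive characters starting at firstChar.
    private static void fill(int firstByte, char firstChar, int count) {
        for (int i = 0; i < count; i++) {
            TABLE[firstByte + i] = (char) (firstChar + i);
        }
    }

    static String decodeLetters(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            sb.append(TABLE[b & 0xFF]); // mask to treat the byte as unsigned
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = {
            (byte) 0x81, (byte) 0x91, (byte) 0xA2,
            (byte) 0xC1, (byte) 0xD1, (byte) 0xE2
        };
        System.out.println(decodeLetters(bytes)); // prints "ajsAJS"
    }
}
```

An encoding that decodes these six bytes to anything other than "ajsAJS" is therefore not using the EBCDIC letter layout.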


      Getting these names correct is important when a Java
      program communicates with a remote system that
      uses EBCDIC. When a handshake occurs, the remote
      machine running Cp1047 should be able to pass the
      code page name Cp1047 back to the Java program,
      not mask the true code page with a replacement like
      Cp037 or Cp500.
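
One way to picture the handshake concern is a lookup that accepts the code page name exactly as the remote side sent it and reports it as unsupported instead of silently substituting a similar one. This is only a sketch: it uses the `java.nio.charset.Charset` API, which postdates the releases named in this report, and the class and method names (`CodePageNegotiation`, `resolveExact`) are hypothetical. Whether a particular name such as Cp1047 resolves depends on the runtime in use.

```java
import java.nio.charset.Charset;
import java.util.Optional;

// Hypothetical helper for illustration; not part of any JDK API.
final class CodePageNegotiation {

    // Returns the charset registered under exactly the requested name
    // (or one of its aliases), or empty if the runtime does not know it.
    // Note: Charset.isSupported throws IllegalCharsetNameException for
    // syntactically illegal names; this sketch lets that propagate.
    static Optional<Charset> resolveExact(String codePageName) {
        if (!Charset.isSupported(codePageName)) {
            return Optional.empty();
        }
        return Optional.of(Charset.forName(codePageName));
    }

    public static void main(String[] args) {
        for (String name : args) {
            Optional<Charset> cs = resolveExact(name);
            if (cs.isPresent()) {
                System.out.println(name + " -> " + cs.get().name()
                        + " " + cs.get().aliases());
            } else {
                System.out.println(name + " -> unsupported, not substituting");
            }
        }
    }
}
```

Run it the same way as the test program, for example with the arguments Cp1046 Cp1047 Cp037 Cp500, to see which names the runtime recognizes.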

      It is not clear if the I18N guide is in error
      (Review ID: 93744)
      ======================================================================

            sherman Xueming Shen
            pallenba Peter Allenbach (Inactive)