JDK-6974189

Re-open bug #4950409, make GB2312 an alias of GBK


    • Enhancement
    • Resolution: Won't Fix
    • P4
    • None
    • 6u21
    • core-libs

      A DESCRIPTION OF THE REQUEST :
      Re-open bug #4950409, which requests that GB2312 be made an alias of GBK instead of EUC_CN.

      This RFE was (in our opinion) erroneously marked as a duplicate of #4914869 when in fact it describes a different issue.

      JUSTIFICATION :
      GBK is an extension of the GB2312 character encoding and is fully backwards compatible. It allows encoding of additional Chinese characters in comparison to GB2312 and its derivatives.
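The superset relationship can be checked with a small probe. This is a minimal sketch, assuming the JDK in use ships both the GB2312 and GBK charsets; the class name `GbkSupersetCheck` and the two sample characters are chosen here for illustration and do not come from the original report.

```java
import java.nio.charset.Charset;

public class GbkSupersetCheck {
    // Returns true if the named charset can encode the given character.
    static boolean canEncode(String charsetName, char c) {
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        char simplified = '\u56FD';   // simplified form of "country", part of GB2312
        char traditional = '\u570B';  // traditional form, outside GB2312 but inside GBK
        for (String name : new String[] {"GB2312", "GBK"}) {
            System.out.println(name
                    + ": simplified=" + canEncode(name, simplified)
                    + ", traditional=" + canEncode(name, traditional));
        }
    }
}
```

Any text that uses characters of the second kind can be written in GBK but not in GB2312, which is why a GBK payload labeled "gb2312" cannot always be decoded by a strict GB2312 decoder.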

      As mentioned in the original RFE, many programs (mail clients in particular) use gb2312 as the encoding name when they mean GBK.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      That "gb2312" (at the very least) be added as an alias to the GBK character set.

      This would result in Charset.forName("gb2312") returning an instance of the GBK Charset.

      This would allow Java programs to correctly decode GBK-encoded data that declares "gb2312" as its encoding (which, as the original submitter remarked, appears to be common practice).
      ACTUAL -
      Charset.forName("gb2312") returns the EUC_CN charset. This produces "unmappable" error characters when decoding Chinese text that is labeled as gb2312 but is in fact encoded using GBK.
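The behavior described above can be reproduced with a few lines. This is a hedged sketch rather than the submitter's test case: the class name `MislabeledGbkDemo`, the helper `decodeAs`, and the sample string are assumptions made for illustration, and the JDK is assumed to ship both charsets.

```java
import java.nio.charset.Charset;

public class MislabeledGbkDemo {
    // Decodes the bytes using whatever charset the JRE resolves for the label.
    static String decodeAs(byte[] data, String label) {
        return new String(data, Charset.forName(label));
    }

    public static void main(String[] args) {
        // Two characters: the first exists in GB2312, the second only in GBK.
        String original = "\u4E2D\u570B";
        byte[] gbkBytes = original.getBytes(Charset.forName("GBK"));

        // Show which charset the JRE picks for the "gb2312" label.
        System.out.println("gb2312 resolves to: " + Charset.forName("gb2312").name());

        // Decoding under the declared label loses the GBK-only character;
        // new String(...) substitutes U+FFFD for input it cannot decode.
        String viaLabel = decodeAs(gbkBytes, "gb2312");
        System.out.println("round trip via gb2312: " + original.equals(viaLabel));
        System.out.println("contains U+FFFD: " + (viaLabel.indexOf('\uFFFD') >= 0));

        // Decoding as GBK recovers the text.
        System.out.println("round trip via GBK: " + original.equals(decodeAs(gbkBytes, "GBK")));
    }
}
```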

      CUSTOMER SUBMITTED WORKAROUND :
      Only one: manually modify the charsets.jar located in the JRE's /lib directory, replacing the existing EUC_CN.class with a modified one that contains the following code:

      public class EUC_CN extends GBK {

      }

      This has the effect of replacing the EUC_CN charset with the GBK charset, which, as the latter is backwards compatible with the former, should not be a problem.

      This is a very ugly hack, but seems to be the only workaround that works, as these mappings are very much hardcoded into the JRE.
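Where the application itself resolves the charset label (as opposed to a third-party library calling Charset.forName internally), the same substitution can be sketched without touching charsets.jar. This is only an illustration under that assumption and not part of the original report; `LenientCharsets` and `forLabel` are hypothetical names.

```java
import java.nio.charset.Charset;

public class LenientCharsets {
    // Resolves a declared label, substituting GBK whenever the label
    // resolves to the same charset that the JRE returns for "gb2312".
    static Charset forLabel(String label) {
        Charset declared = Charset.forName(label);
        boolean isGb2312 = declared.equals(Charset.forName("gb2312"));
        if (isGb2312 && Charset.isSupported("GBK")) {
            // GBK is backwards compatible with GB2312, so genuine GB2312 data still decodes.
            return Charset.forName("GBK");
        }
        return declared;
    }

    public static void main(String[] args) {
        byte[] gbkBytes = "\u4E2D\u570B".getBytes(Charset.forName("GBK"));
        String decoded = new String(gbkBytes, forLabel("gb2312"));
        System.out.println("decoded correctly: " + decoded.equals("\u4E2D\u570B"));
    }
}
```

This does not help when the label is resolved inside code the application does not control, which is the situation the request is aimed at.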

            sherman Xueming Shen
            webbuggrp Webbug Group