charset EUC_TW is 12.6% of the total size of charsets.jar

    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • 7
    • Affects Version/s: 7
    • Component/s: core-libs
    • None

      JDK7 b55

      charsets.jar: 6239629 (uncompressed)
      EUC_TW.class: 2313
      EUC_TW$Decoder.class: 298066
      EUC_TW$Encoder.class: 486890
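
      The 12.6% figure in the summary can be re-derived from the sizes listed above. The following is only a small arithmetic sketch; the class and method names (`EucTwShare`, `percentOfJar`) are made up for illustration and are not part of the JDK.

```java
// Hypothetical helper: recomputes the EUC_TW share of charsets.jar
// from the sizes quoted in this report (JDK7 b55).
final class EucTwShare {
    // Sizes as quoted in the report.
    static final long JAR_SIZE = 6239629L;      // charsets.jar, uncompressed
    static final long OUTER_CLASS = 2313L;      // EUC_TW.class
    static final long DECODER_CLASS = 298066L;  // EUC_TW$Decoder.class
    static final long ENCODER_CLASS = 486890L;  // EUC_TW$Encoder.class

    // Combined size of the three EUC_TW class files.
    static long eucTwTotal() {
        return OUTER_CLASS + DECODER_CLASS + ENCODER_CLASS;
    }

    // Share of the jar taken by `part`, in percent.
    static double percentOfJar(long part, long jar) {
        return part * 100.0 / jar;
    }

    public static void main(String[] args) {
        long total = eucTwTotal();
        System.out.printf("EUC_TW total: %d, share: %.1f%%%n",
                total, percentOfJar(total, JAR_SIZE));
    }
}
```

      The three class files add up to 787269, which rounds to 12.6% of the jar.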

      EUC_TW has a total of 55446 code points, including supplementary characters in the U+20000-U+30000 range. The existing data structure, which stores each supplementary character in surrogate form, and the implementation built on it take too much space.
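
      To illustrate why surrogate form is costly, here is a minimal sketch comparing it with one possible compact alternative: storing only the offset from U+20000 in a single char, with the plane recorded elsewhere (for example in a separate flag). This is an assumption made for illustration, not necessarily the approach taken in the fix; `SuppStorageSketch`, `compact` and `expand` are hypothetical names.

```java
// Hypothetical sketch: two ways to store a plane-2 supplementary code point.
final class SuppStorageSketch {
    static final int PLANE2_BASE = 0x20000;
    static final int PLANE2_LAST = 0x2FFFF;

    // Surrogate form: every supplementary code point costs two UTF-16 chars.
    static char[] asSurrogates(int codePoint) {
        return Character.toChars(codePoint);
    }

    // Compact form (assumed): keep only the offset from U+20000 in one char.
    // The caller must remember separately that the entry is supplementary.
    static char compact(int codePoint) {
        if (codePoint < PLANE2_BASE || codePoint > PLANE2_LAST) {
            throw new IllegalArgumentException("not a plane-2 code point");
        }
        return (char) (codePoint - PLANE2_BASE);
    }

    // Restores the full code point from the compact form.
    static int expand(char stored) {
        return PLANE2_BASE + stored;
    }

    public static void main(String[] args) {
        int cp = 0x20021;
        System.out.println("surrogate chars: " + asSurrogates(cp).length);
        System.out.println("round trip ok: " + (expand(compact(cp)) == cp));
    }
}
```

      Under this assumption each supplementary entry needs one char instead of two, at the cost of tracking which entries are supplementary.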

            Assignee:
            Xueming Shen
            Reporter:
            Xueming Shen
            Votes:
            0
            Watchers:
            0