JDK-6447475

charsets.jar is at least 1.1MB bigger than it should be.

      A set of double-byte charsets in the ExtendedCharsets package (sun.nio.cs.ext) share the
      following implementation model:

      public class CharsetXYZ extends Charset {
      ...
          public String getDecoderIndex2() {
              return Decoder.index2;
          }
          public String getEncoderIndex2() {
              return Encoder.index2;
          }
          private static class Decoder extends XYZDecoder {
              private final static String index2 = "HUGE STRING CONSTANT 1";
              ...
          }
          private static class Encoder extends XYZEncoder {
              private final static String index2 = "HUGE STRING CONSTANT 2";
              ...
          }
      ...
      }

      getDecoderIndex2() and getEncoderIndex2() are utility methods used to share
      the huge String data with the corresponding converter implementations in the sun.io
      package. They are supposed to save space, both at runtime and in static storage (the
      jar file), by letting the two implementations (sun.nio.cs.ext and sun.io) share the
      same data. However, the implementation model above has a loophole that completely
      defeats that expectation. Because De/Encoder.index2 is a "final" and "static" String
      initialized with a literal, javac treats it as a compile-time constant and copies its
      value into CharsetXYZ.class instead of emitting a reference to De/Encoder.index2.
      As a result, the supposedly lightweight CharsetXYZ.class becomes unreasonably
      huge, because it carries copies of both De/Encoder.index2 strings.

      The charsets in charsets.jar whose charset class is overweight are listed
      below.

      Either removing the keyword "final" from the String constant declaration or
      reorganizing the declaration as

      private final static String index2;
      static {
          index2 = "HUGE STRING CONSTANT 2";
      }

      yields a surprising 1.1MB decrease in size out of the 4.6MB charsets.jar.
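
      The difference between the original declaration and the two proposed fixes can be
      observed at runtime without inspecting class files. The sketch below relies on one
      documented Java rule: reading a constant variable (a final field initialized with a
      constant expression) does not trigger initialization of the declaring class, because
      the compiler has already copied the value into the reader. The class names
      (ConstantInliningDemo, ConstantHolder, NonFinalHolder, BlankFinalHolder) and the
      short placeholder string are hypothetical and stand in for the real charset classes
      and their index2 data.

```java
import java.util.ArrayList;
import java.util.List;

class ConstantInliningDemo {
    // Records which holder classes the JVM actually initialized.
    static final List<String> initialized = new ArrayList<>();

    // Original pattern: "static final" plus a literal initializer makes index2
    // a constant variable, so javac copies the value into every class that reads it.
    static class ConstantHolder {
        static final String index2 = "HUGE STRING CONSTANT";
        static { initialized.add("ConstantHolder"); }
    }

    // Proposed fix 1: without "final" the field is no longer a constant variable,
    // so readers refer to the field instead of carrying a copy.
    static class NonFinalHolder {
        static String index2 = "HUGE STRING CONSTANT";
        static { initialized.add("NonFinalHolder"); }
    }

    // Proposed fix 2: a blank final assigned in a static initializer has no
    // constant-expression initializer, so it is not a constant variable either.
    static class BlankFinalHolder {
        static final String index2;
        static {
            index2 = "HUGE STRING CONSTANT";
            initialized.add("BlankFinalHolder");
        }
    }

    // Reads all three fields and returns the list of classes that got initialized.
    static List<String> touchAll() {
        String a = ConstantHolder.index2;   // value was inlined; no class initialization
        String b = NonFinalHolder.index2;   // real field read; initializes NonFinalHolder
        String c = BlankFinalHolder.index2; // real field read; initializes BlankFinalHolder
        return initialized;
    }

    public static void main(String[] args) {
        System.out.println(touchAll()); // prints [NonFinalHolder, BlankFinalHolder]
    }
}
```

      ConstantHolder never appears in the output because the reading class holds its own
      copy of the string, which is the same mechanism that places the two huge index2
      strings into CharsetXYZ.class.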

         297551 EUC_TW$Decoder.class
         486384 EUC_TW$Encoder.class
         468722 EUC_TW.class

          47562 IBM1381$Decoder.class
          67448 IBM1381$Encoder.class
          80838 IBM1381.class

          27037 IBM1383$Decoder.class
          65805 IBM1383$Encoder.class
          76730 IBM1383.class

          57450 IBM33722$Decoder.class
         113585 IBM33722$Encoder.class
         151495 IBM33722.class

          45223 IBM930$Decoder.class
          77387 IBM930$Encoder.class
          98091 IBM930.class

          91826 IBM933$Decoder.class
         130404 IBM933$Encoder.class
          59788 IBM933.class

          39135 IBM935$Decoder.class
          67545 IBM935$Encoder.class
          82161 IBM935.class

          72513 IBM937$Decoder.class
          85852 IBM937$Encoder.class
         142030 IBM937.class

          45223 IBM939$Decoder.class
          77386 IBM939$Encoder.class
          98090 IBM939.class

          38683 IBM942$Decoder.class
          69778 IBM942$Encoder.class
          82680 IBM942.class
          17292 IBM942C$Encoder.class

          39149 IBM943$Decoder.class
          68398 IBM943$Encoder.class
          90800 IBM943.class
          25827 IBM943C$Encoder.class

          74150 IBM948$Decoder.class
          85592 IBM948$Encoder.class
         141632 IBM948.class

          80623 IBM950$Decoder.class
          85592 IBM950$Encoder.class
         139913 IBM950.class

         110046 IBM964$Decoder.class
         156843 IBM964$Encoder.class
         255762 IBM964.class

          27011 IBM970$Decoder.class
         121104 IBM970$Encoder.class
          77734 IBM970.class

            sherman Xueming Shen