JDK-8242541

Small charset issues (ISO8859-16, x-eucJP-Open, x-IBM834 and x-IBM949C)

        Several small charset issues were found:
        * Missing historical name alias in ISO-8859-16
        * Typo in the hisname of x-eucJP-Open
        * x-IBM834 and x-IBM949C charset source code should be template style

        Details are as follows:

        Missing historical name alias in ISO-8859-16

        java.io.InputStreamReader.getEncoding() returns the historical name of a Charset if the charset implements the sun.nio.cs.HistoricallyNamedCharset interface.
        However, the historical name of the ISO-8859-16 charset (ISO8859_16) is not defined as one of its aliases, so Charset.forName() cannot resolve it.

        ======
        $ cat HistNameTest.java
        import java.nio.charset.*;
        import java.io.*;


        public class HistNameTest {
            public static void main(String[] args) throws Exception {
                for(Charset cs : Charset.availableCharsets().values()) {
                    String enc = (new InputStreamReader(System.in, cs)).getEncoding();
                    try {
                        if (!cs.equals(Charset.forName(enc)))
                            System.err.println(cs.name()+"<>"+enc);
                    } catch (Exception e) {
                        System.err.println(cs.name());
                        e.printStackTrace();
                    }
                }
            }
        }
        $ ~/jdk-15.jdk/Contents/Home/bin/java HistNameTest.java
        ISO-8859-16
        java.nio.charset.UnsupportedCharsetException: ISO8859_16
            at java.base/java.nio.charset.Charset.forName(Charset.java:526)
            at HistNameTest.main(HistNameTest.java:9)
        ...
        ======
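The round trip that fails above can also be checked for a single charset. The following is a minimal sketch, not part of the original report: the class name HistNameCheck and the helper resolvesBack are hypothetical, and it reads from an empty in-memory stream instead of System.in. The result for ISO-8859-16 depends on the JDK build in use.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

final class HistNameCheck {
    // Hypothetical helper: true if the name reported by getEncoding()
    // resolves back to the same charset through Charset.forName().
    static boolean resolvesBack(Charset cs) {
        // The reader wraps an empty in-memory stream, so leaving it open is harmless.
        InputStreamReader reader =
            new InputStreamReader(new ByteArrayInputStream(new byte[0]), cs);
        String enc = reader.getEncoding();
        try {
            return cs.equals(Charset.forName(enc));
        } catch (IllegalArgumentException e) {
            // Covers UnsupportedCharsetException and IllegalCharsetNameException.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UTF-8", "ISO-8859-16", "x-eucJP-Open"}) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this runtime");
                continue;
            }
            Charset cs = Charset.forName(name);
            System.out.println(name + ": aliases=" + cs.aliases()
                + ", resolvesBack=" + resolvesBack(cs));
        }
    }
}
```

Printing the aliases next to the result makes it easy to see whether the historical name is missing from the alias set.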

        Typo in the hisname of x-eucJP-Open

        In make/data/charsetmapping/charsets, the hisname of x-eucJP-Open is "EUC_JP_Solari".
        This is not valid; it should be "EUC_JP_Solaris".
        ======
        charset x-eucJP-Open EUC_JP_Open
            package sun.nio.cs.ext
            type    template
            hisname EUC_JP_Solari
            ascii   true
            alias   EUC_JP_Solaris       # JDK historical
            alias   eucJP-open
        ======

        This hisname is not actually used, however.
        In src/jdk.charsets/share/classes/sun/nio/cs/ext/EUC_JP_Open.java.template,
        the historical name is hard-coded. Even so, the typo should be fixed.
        ======
            public EUC_JP_Open() {
                super("x-eucJP-Open", $ALIASES$);
            }

             public String historicalName() {
                return "EUC_JP_Solaris";
            }
        ======
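Both problems described so far show up as a hisname that is not listed among the aliases of the same stanza. The sketch below checks for that. It is only an illustration under stated assumptions: the class HisnameAliasCheck and the method findUnaliasedHisnames are hypothetical, the parser handles only the stanza layout visible in the excerpt above (charset, hisname and alias lines, with # comments), and the rule that a hisname should also appear as an alias is an assumption drawn from this report, not a documented requirement of the build.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class HisnameAliasCheck {
    // Returns "charset: hisname" for every stanza whose hisname is not
    // one of its alias entries. Assumes the layout shown in the excerpt.
    static List<String> findUnaliasedHisnames(String text) {
        List<String> problems = new ArrayList<>();
        String name = null;
        String hisname = null;
        Set<String> aliases = new HashSet<>();
        // A trailing sentinel "charset" line flushes the last stanza.
        for (String raw : (text + "\ncharset").split("\n")) {
            String line = raw.replaceFirst("#.*", "").trim(); // drop comments
            if (line.isEmpty()) {
                continue;
            }
            String[] tok = line.split("\\s+");
            if (tok[0].equals("charset")) {
                if (name != null && hisname != null && !aliases.contains(hisname)) {
                    problems.add(name + ": " + hisname);
                }
                name = tok.length > 1 ? tok[1] : null;
                hisname = null;
                aliases = new HashSet<>();
            } else if (tok[0].equals("hisname") && tok.length > 1) {
                hisname = tok[1];
            } else if (tok[0].equals("alias") && tok.length > 1) {
                aliases.add(tok[1]);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        String stanza = String.join("\n",
            "charset x-eucJP-Open EUC_JP_Open",
            "    package sun.nio.cs.ext",
            "    type    template",
            "    hisname EUC_JP_Solari",
            "    ascii   true",
            "    alias   EUC_JP_Solaris       # JDK historical",
            "    alias   eucJP-open");
        System.out.println(findUnaliasedHisnames(stanza));
        // prints [x-eucJP-Open: EUC_JP_Solari]
    }
}
```

The same check would flag a stanza that declares a hisname but has no matching alias line at all, which is the shape of the ISO-8859-16 problem.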

        x-IBM834 and x-IBM949C charset source code should be template style

        According to make/data/charsetmapping/charsets,
        the type of both x-IBM834 and x-IBM949C is "source".
        ======
        charset x-IBM834 IBM834 # EBCDIC DBCS-only Korean
            package sun.nio.cs.ext
            type    source
        ...
        charset x-IBM949C IBM949C
            package sun.nio.cs.ext
            type    source
        ======

        IBM834.java refers to the IBM933 class.
        src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM834.java
        ======
            public CharsetDecoder newDecoder() {
                IBM933.initb2c();
                return new DoubleByte.Decoder_DBCSONLY(
                    this, IBM933.b2c, null, 0x40, 0xfe);  // hardcode the b2min/max
            }
        ======

        IBM949C.java refers to the IBM949 class.
        src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM949C.java
        ======
            public CharsetDecoder newDecoder() {
                return new DoubleByte.Decoder(this,
                                              IBM949.b2c,
                                              b2cSB,
                                              0xa1,
                                              0xfe);
            }
        ======

        According to make/data/charsetmapping/charsets,
        IBM933 and IBM949 are not of type "source".
        ======
        charset x-IBM933 IBM933
            package sun.nio.cs.ext
            type    ebcdic
        ...
        charset x-IBM949 IBM949
            package sun.nio.cs.ext
            type    dbcs
        ======

        IBM933 and IBM949 can be moved to the sun.nio.cs package via the make/data/charsetmapping/stdcs-* files.
        If that happens, IBM834 and IBM949C, which refer to them, cannot be moved to the sun.nio.cs package as long as their type is "source".
        Their source code should therefore be template style.
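The constraint can be restated as a small check over the data quoted in this report. This is a sketch only: the class SourceTypeDependencyCheck and the method blockedCharsets are hypothetical, and the type and dependency maps are typed in by hand from the excerpts above, not read from the real build files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SourceTypeDependencyCheck {
    // Lists "charset -> dependency" pairs where a "source" type charset
    // refers to a charset whose type is something other than "source".
    static List<String> blockedCharsets(Map<String, String> types,
                                        Map<String, String> dependsOn) {
        List<String> blocked = new ArrayList<>();
        // TreeMap gives a stable, sorted iteration order.
        for (Map.Entry<String, String> e : new TreeMap<>(dependsOn).entrySet()) {
            String depType = types.get(e.getValue());
            if ("source".equals(types.get(e.getKey()))
                    && depType != null && !depType.equals("source")) {
                blocked.add(e.getKey() + " -> " + e.getValue());
            }
        }
        return blocked;
    }

    public static void main(String[] args) {
        // Types as quoted from make/data/charsetmapping/charsets in this report.
        Map<String, String> types = Map.of(
            "x-IBM834", "source",
            "x-IBM949C", "source",
            "x-IBM933", "ebcdic",
            "x-IBM949", "dbcs");
        // Class references as quoted from IBM834.java and IBM949C.java.
        Map<String, String> dependsOn = Map.of(
            "x-IBM834", "x-IBM933",
            "x-IBM949C", "x-IBM949");
        System.out.println(blockedCharsets(types, dependsOn));
        // prints [x-IBM834 -> x-IBM933, x-IBM949C -> x-IBM949]
    }
}
```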

              itakiguchi Ichiroh Takiguchi