JDK-8242541

Small charset issues (ISO8859-16, x-eucJP-Open, x-IBM834 and x-IBM949C)

      Description

        Found several small charset issues:
        * Missing historical name alias in ISO-8859-16
        * Typo in hisname for x-eucJP-Open
        * x-IBM834 and x-IBM949C charset source code should be template style

        Details are as follows:

        Missing historical name alias in ISO-8859-16

        java.io.InputStreamReader.getEncoding() returns the charset's historical name if the charset implements the sun.nio.cs.HistoricallyNamedCharset interface.
        However, the historical name of the ISO-8859-16 charset (ISO8859_16) is not defined as one of its aliases, so Charset.forName() cannot resolve it.

        ======
        $ cat HistNameTest.java
        import java.nio.charset.*;
        import java.io.*;

        public class HistNameTest {
            public static void main(String[] args) throws Exception {
                for(Charset cs : Charset.availableCharsets().values()) {
                    String enc = (new InputStreamReader(System.in, cs)).getEncoding();
                    try {
                        if (!cs.equals(Charset.forName(enc)))
                            System.err.println(cs.name()+"<>"+enc);
                    } catch (Exception e) {
                        System.err.println(cs.name());
                        e.printStackTrace();
                    }
                }
            }
        }
        $ ~/jdk-15.jdk/Contents/Home/bin/java HistNameTest.java
        ISO-8859-16
        java.nio.charset.UnsupportedCharsetException: ISO8859_16
            at java.base/java.nio.charset.Charset.forName(Charset.java:526)
            at HistNameTest.main(HistNameTest.java:9)
        ...
        ======
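The same round-trip check can be applied to a single charset. The following is a minimal sketch, not part of the original report: the class name HistNameCheck and the helper roundTrips are hypothetical, and the reader is built over an empty in-memory stream so the check does not depend on System.in.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class HistNameCheck {
    // Hypothetical helper: returns true if the name reported by getEncoding()
    // resolves back to the same charset through Charset.forName().
    public static boolean roundTrips(Charset cs) {
        // An empty in-memory stream is enough; nothing is read from it.
        String enc = new InputStreamReader(
                new ByteArrayInputStream(new byte[0]), cs).getEncoding();
        try {
            return cs.equals(Charset.forName(enc));
        } catch (IllegalArgumentException e) {
            // Covers UnsupportedCharsetException and IllegalCharsetNameException.
            return false;
        }
    }

    public static void main(String[] args) {
        // Charset name to check; defaults to the one discussed in this issue.
        String name = args.length > 0 ? args[0] : "ISO-8859-16";
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime");
            return;
        }
        System.out.println(name + " round-trips: "
                + roundTrips(Charset.forName(name)));
    }
}
```

On a JDK affected by this issue, the check would be expected to report false for ISO-8859-16, matching the UnsupportedCharsetException shown above.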

        Typo in hisname for x-eucJP-Open

        According to make/data/charsetmapping/charsets,
        the hisname of x-eucJP-Open is "EUC_JP_Solari", which is not valid; it should be "EUC_JP_Solaris".
        ======
        charset x-eucJP-Open EUC_JP_Open
            package sun.nio.cs.ext
            type    template
            hisname EUC_JP_Solari
            ascii   true
            alias   EUC_JP_Solaris       # JDK historical
            alias   eucJP-open
        ======

        This hisname is not actually used.
        According to src/jdk.charsets/share/classes/sun/nio/cs/ext/EUC_JP_Open.java.template,
        the historical name is hard-coded there, but the typo should still be fixed.
        ======
            public EUC_JP_Open() {
                super("x-eucJP-Open", $ALIASES$);
            }

             public String historicalName() {
                return "EUC_JP_Solaris";
            }
        ======
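To see which names actually resolve at runtime, a lookup along the following lines may help. This is a minimal sketch, not part of the original report: the class name EucJpOpenAliasCheck and the helper resolve are hypothetical, and the result depends on whether the jdk.charsets module is present in the runtime.

```java
import java.nio.charset.Charset;

public class EucJpOpenAliasCheck {
    // Hypothetical helper: returns the canonical name that a charset name
    // or alias resolves to, or null if the runtime cannot resolve it.
    public static String resolve(String name) {
        try {
            return Charset.forName(name).name();
        } catch (IllegalArgumentException e) {
            // Covers UnsupportedCharsetException and IllegalCharsetNameException.
            return null;
        }
    }

    public static void main(String[] args) {
        // "EUC_JP_Solaris" is the alias listed in the charsets file;
        // "EUC_JP_Solari" is the misspelled hisname from the same entry.
        for (String name : new String[] {"EUC_JP_Solaris", "EUC_JP_Solari"}) {
            System.out.println(name + " -> " + resolve(name));
        }
    }
}
```

Because the historical name is hard-coded in the template, the misspelled hisname is not expected to change what this prints; the sketch only confirms which spelling the runtime accepts.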

        x-IBM834 and x-IBM949C charset source code should be template style

        According to make/data/charsetmapping/charsets,
        the type of both x-IBM834 and x-IBM949C is "source".
        ======
        charset x-IBM834 IBM834 # EBCDIC DBCS-only Korean
            package sun.nio.cs.ext
            type    source
        ...
        charset x-IBM949C IBM949C
            package sun.nio.cs.ext
            type    source
        ======

        IBM834.java refers to the IBM933 class.
        src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM834.java
        ======
            public CharsetDecoder newDecoder() {
                IBM933.initb2c();
                return new DoubleByte.Decoder_DBCSONLY(
                    this, IBM933.b2c, null, 0x40, 0xfe);  // hardcode the b2min/max
            }
        ======

        IBM949C.java refers to the IBM949 class.
        src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM949C.java
        ======
            public CharsetDecoder newDecoder() {
                return new DoubleByte.Decoder(this,
                                              IBM949.b2c,
                                              b2cSB,
                                              0xa1,
                                              0xfe);
            }
        ======

        According to make/data/charsetmapping/charsets,
        IBM933 and IBM949 are not of type "source".
        ======
        charset x-IBM933 IBM933
            package sun.nio.cs.ext
            type    ebcdic
        ...
        charset x-IBM949 IBM949
            package sun.nio.cs.ext
            type    dbcs
        ======

        IBM933 and IBM949 can be moved to the sun.nio.cs package via a make/data/charsetmapping/stdcs-* file.
        In that case, IBM834 and IBM949C, which refer to them, cannot be moved to the sun.nio.cs package as long as their type is "source".
        So their source code should be converted to template style.
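To check which package each of these charsets is loaded from in a given build, something like the following could be used. This is a minimal sketch, not part of the original report: the class name CharsetPackageCheck and the helper implPackage are hypothetical, and the packages printed depend on the JDK build and platform.

```java
import java.nio.charset.Charset;

public class CharsetPackageCheck {
    // Hypothetical helper: returns the package of the class implementing the
    // named charset, or null if the charset is not supported by this runtime.
    public static String implPackage(String name) {
        if (!Charset.isSupported(name)) {
            return null;
        }
        return Charset.forName(name).getClass().getPackageName();
    }

    public static void main(String[] args) {
        // The four charsets discussed in this section.
        String[] names = {"x-IBM834", "x-IBM933", "x-IBM949", "x-IBM949C"};
        for (String name : names) {
            System.out.println(name + " -> " + implPackage(name));
        }
    }
}
```

If a build places IBM933 or IBM949 in sun.nio.cs while IBM834 or IBM949C stays in sun.nio.cs.ext, the output would show the split described above.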

              People

                itakiguchi Ichiroh Takiguchi