JDK-8202329

Codepage mappings for IBM-943 and Big5 (aix)

    • Type: Bug
    • Resolution: Fixed
    • Priority: P3
    • 11
    • 11
    • core-libs
    • b22
    • aix
    • Verified

        Reported by Bhaktavatsal R Maram at http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-April/052799.html

        This issue is a continuation of bug 8201540 (Extend the set of supported charsets in java.base on AIX), in which we moved the default charsets of most locales supported by the operating system into the java.base module, thus enabling OpenJDK for those locales on the AIX platform.

        As part of that, the charsets for the locales Ja_JP (IBM-943) and Zh_TW (big5) were also moved. However, the Java charsets they are mapped to are not correct on AIX. The details are as follows:

        1. IBM-943 [1] for locale Ja_JP should be mapped to IBM-943C [2]

        The fundamental difference between IBM-943 and IBM-943C is that IBM-943C is ASCII compatible, which means that the code points for 'yen' and 'overline' in IBM-943 are replaced with 'backslash' and 'tilde' from the ASCII character set.


        2. Big5 for locale Zh_TW should be mapped to IBM-950 [3]
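
        The difference described in item 1 can be observed with a small decoding check. This is only a sketch, not part of the attached fix: the class name CompareIbm943 and its helper methods are hypothetical, and it assumes that the two byte values in question are 0x5C and 0x7E (the ASCII positions of backslash and tilde). Because x-IBM943 and x-IBM943C may not be present in every runtime, the code reports a charset as unavailable instead of failing.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of the attached fix.
class CompareIbm943 {

    // Decodes the bytes with the named charset, or returns null when
    // this runtime does not provide that charset.
    static String decodeWith(String charsetName, byte[] bytes) {
        if (!Charset.isSupported(charsetName)) {
            return null;
        }
        return new String(bytes, Charset.forName(charsetName));
    }

    // Renders each char of the string as U+XXXX, separated by spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: these are the two positions where the charsets differ.
        byte[] bytes = {0x5C, 0x7E};
        for (String name : new String[] {"US-ASCII", "x-IBM943", "x-IBM943C"}) {
            String decoded = decodeWith(name, bytes);
            System.out.println(name + " -> "
                    + (decoded == null ? "not available" : codePoints(decoded)));
        }
    }
}
```

        If the description above holds, x-IBM943C should print the same code points as US-ASCII, while x-IBM943 should print different ones.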

        I've attached a simple test program that prints the default charset, along with a fix for this issue. When the test program (PrintDefaultCharset) is run with IBM JDK 8 (on AIX) for the locales Ja_JP and Zh_TW, the output is as follows:

        -bash-4.4$ LANG=Ja_JP ~/JDKs/IBM/80/ON/sdk/jre/bin/java PrintDefaultCharset
        LANG = Ja_JP
        Default charset = x-IBM943C
        file.encoding = IBM-943C
        sun.jnu.encoding = IBM-943C

        -bash-4.4$ LANG=Zh_TW ~/JDKs/IBM/80/ON/sdk/jre/bin/java PrintDefaultCharset
        LANG = Zh_TW
        Default charset = x-IBM950
        file.encoding = IBM-950
        sun.jnu.encoding = IBM-950


        The same test run with OpenJDK 11 gives the following output:

        -bash-4.4$ LANG=Ja_JP ~/jdk/bin/java PrintDefaultCharset
        LANG = Ja_JP
        Default charset = x-IBM943
        file.encoding = IBM-943
        sun.jnu.encoding = IBM-943

        -bash-4.4$ LANG=Zh_TW ~/jdk/bin/java PrintDefaultCharset
        LANG = Zh_TW
        Default charset = Big5
        file.encoding = big5
        sun.jnu.encoding = big5
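
        The comparison between the two runs could also be automated. The following is a minimal sketch under stated assumptions: the class name CheckAixDefaultCharset and its methods are hypothetical, and the expected values are simply the default charset names printed by IBM JDK 8 in the output above. It covers only the two locales discussed in this report.

```java
import java.nio.charset.Charset;
import java.util.Map;

// Hypothetical helper class, not part of the attached fix.
class CheckAixDefaultCharset {

    // Expected default charset per LANG value, taken from the IBM JDK 8
    // output quoted in this report.
    static final Map<String, String> EXPECTED = Map.of(
            "Ja_JP", "x-IBM943C",
            "Zh_TW", "x-IBM950");

    // Returns the expected charset name, or null if LANG is unset or unknown.
    static String expectedFor(String lang) {
        return lang == null ? null : EXPECTED.get(lang);
    }

    // Compares the actual default charset name with the expected one.
    static String verdict(String lang, String actual) {
        String expected = expectedFor(lang);
        if (expected == null) {
            return "no expectation recorded for LANG=" + lang;
        }
        return expected.equals(actual)
                ? "OK: " + actual
                : "MISMATCH: expected " + expected + " but got " + actual;
    }

    public static void main(String[] args) {
        System.out.println(
                verdict(System.getenv("LANG"), Charset.defaultCharset().name()));
    }
}
```

        Run with LANG=Zh_TW against the OpenJDK 11 build above, this sketch would report a mismatch, because that build's default charset is Big5.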

        The following test program reproduces the problem:

        import java.nio.charset.*;

        class PrintDefaultCharset {
            public static void main(String[] args) {
                // Print the locale setting and the charsets the JVM derived from it.
                System.out.println("LANG = " + System.getenv("LANG"));
                System.out.println("Default charset = " + Charset.defaultCharset().name());
                System.out.println("file.encoding = " + System.getProperty("file.encoding"));
                System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
            }
        }

              simonis Volker Simonis