  1. JDK
  2. JDK-8337077

Java uses wrong Charset in System.out when running on MINGW

    • x86_64
    • windows

      ADDITIONAL SYSTEM INFORMATION :
      $ java -version
      java version "22.0.1" 2024-04-16
      Java(TM) SE Runtime Environment (build 22.0.1+8-16)
      Java HotSpot(TM) 64-Bit Server VM (build 22.0.1+8-16, mixed mode, sharing)

      Windows 10, Windows 11

      Also reproducible in Java 17, and probably in any other Java release older than 22.0.1.

      A DESCRIPTION OF THE PROBLEM :
      If you run Java in Git Bash (a MINGW64-based environment), System.out uses the wrong charset.

      Git is now a de facto standard, and on Windows it installs a UNIX-like Git Bash environment based on MINGW64. Many Java developers use Git Bash to build their code with Maven and Gradle and to run other development tasks, including Java-based tools.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      1. Download the Jansi library (jansi-2.4.1.jar).
      2. Run it in a CMD window with a command like: java -jar jansi-2.4.1.jar
      3. Run it again in a Git Bash (MINGW64) window with the same command: java -jar jansi-2.4.1.jar


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      Both step 2 and step 3 should print the pseudo-graphics logo of the JANSI library at the end of the demonstration.
      ACTUAL -
      Step 2 works properly.
      Step 3 prints only question marks, because System.out uses the cp1252 encoding instead of UTF-8.

      How do I know which encoding is correct in MINGW64? From either the output of the "locale" command or the LC_* environment variables:

      $ locale
      LANG=
      LC_CTYPE="en_GB.UTF-8"
      LC_NUMERIC="C.UTF-8"
      LC_TIME="C.UTF-8"
      LC_COLLATE="C.UTF-8"
      LC_MONETARY="C.UTF-8"
      LC_MESSAGES="C.UTF-8"
      LC_ALL=

      $ echo $LC_CTYPE
      en_GB.UTF-8
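
To compare what the shell reports with what the JVM picked up, a small diagnostic can be run in both CMD and Git Bash. This is only a sketch: the class name EncodingReport is made up for illustration, and it assumes the system properties file.encoding, native.encoding and stdout.encoding are the relevant ones. On releases that do not define a property, its value prints as null.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the original report.
public class EncodingReport {

    /** Collects the encoding-related settings visible to the JVM, in a stable order. */
    public static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        // System.getProperty returns null when a release does not define the property.
        for (String key : new String[] {"file.encoding", "native.encoding", "stdout.encoding"}) {
            report.put(key, String.valueOf(System.getProperty(key)));
        }
        // Environment variables that Git Bash (MINGW64) normally sets or inherits.
        for (String key : new String[] {"MSYSTEM", "LC_ALL", "LC_CTYPE", "LANG"}) {
            report.put("env " + key, String.valueOf(System.getenv(key)));
        }
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

If the report shows a Windows code page for the JVM while LC_CTYPE names UTF-8, the two sides disagree in the way described above.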


      ---------- BEGIN SOURCE ----------
      Code similar to what jansi-2.4.1.jar runs (requires jansi-2.4.1.jar on your classpath):

      import org.fusesource.jansi.AnsiMain;

      import java.io.BufferedReader;
      import java.io.IOException;
      import java.io.InputStreamReader;
      import java.nio.charset.StandardCharsets;

      public class TestClass {
          public static void main(String[] args) {
              try (BufferedReader in = new BufferedReader(new InputStreamReader(AnsiMain.class.getResourceAsStream("jansi.txt"), StandardCharsets.UTF_8))) {
                  for (String line = in.readLine(); line != null; line = in.readLine()) {
                      System.out.println(line);
                  }
              } catch (IOException e) {
                  // ignore
              }
          }
      }

      You can also extract the jansi.txt file from jansi-2.4.1.jar and read it directly, so that the library no longer needs to be on the classpath.
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      import org.fusesource.jansi.AnsiMain;

      import java.io.BufferedReader;
      import java.io.IOException;
      import java.io.InputStreamReader;
      import java.nio.charset.Charset;
      import java.nio.charset.StandardCharsets;

      public class TestClass {
          public static void main(String[] args) {
              try (BufferedReader in = new BufferedReader(new InputStreamReader(AnsiMain.class.getResourceAsStream("jansi.txt"), StandardCharsets.UTF_8))) {
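                  String msystem = System.getenv("MSYSTEM");
                  String lcCtype = System.getenv("LC_CTYPE");
                  // Override the charset only under Git Bash (MINGW) and only when LC_CTYPE names one, e.g. en_GB.UTF-8
                  if (msystem != null && msystem.startsWith("MINGW") && lcCtype != null && lcCtype.contains(".")) {
                      Charset charset = Charset.forName(lcCtype.split("\\.")[1]);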
                      for (String line = in.readLine(); line != null; line = in.readLine()) {
                          System.out.write(line.getBytes(charset));
                          System.out.println();
                      }
                  } else {
                      for (String line = in.readLine(); line != null; line = in.readLine()) {
                          System.out.println(line);
                      }
                  }
              } catch (IOException e) {
                  // ignore
              }
          }
      }
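
A variant of the workaround above that avoids encoding each line by hand is to wrap standard output in a PrintStream with an explicit charset. The sketch below is not from the original report: the class LocaleCharset and its helpers are hypothetical names, and it assumes the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG) is what Git Bash exports. When no charset can be derived, it falls back to the unmodified System.out.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;

// Hypothetical helper class, not part of the original report.
public class LocaleCharset {

    /**
     * Extracts the codeset from a POSIX-style locale name such as
     * "en_GB.UTF-8" or "de_DE.ISO-8859-1@euro". Returns the fallback when
     * the name is null, has no codeset part, or names an unknown charset.
     */
    public static Charset fromLocaleName(String localeName, Charset fallback) {
        if (localeName == null) {
            return fallback;
        }
        int dot = localeName.indexOf('.');
        if (dot < 0) {
            return fallback;
        }
        String codeset = localeName.substring(dot + 1);
        int at = codeset.indexOf('@'); // strip a modifier such as "@euro"
        if (at >= 0) {
            codeset = codeset.substring(0, at);
        }
        try {
            return Charset.forName(codeset);
        } catch (IllegalArgumentException e) { // illegal or unsupported charset name
            return fallback;
        }
    }

    /** Returns the first value that is neither null nor empty, or null if there is none. */
    public static String firstNonEmpty(String... values) {
        for (String value : values) {
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
        String name = firstNonEmpty(
                System.getenv("LC_ALL"), System.getenv("LC_CTYPE"), System.getenv("LANG"));
        Charset charset = fromLocaleName(name, null);
        // Keep the JVM's own System.out when the environment names no usable charset.
        PrintStream out = (charset == null)
                ? System.out
                : new PrintStream(new FileOutputStream(FileDescriptor.out), true, charset);
        out.println("Box drawing test: \u2554\u2550\u2557");
    }
}
```

Unlike the submitted workaround, this sketch does not check MSYSTEM, so it also applies in any other environment that exports LC_* variables. Whether that is desirable depends on the tool.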

      FREQUENCY : always


        1. Capture.PNG
          11 kB
          Naoto Sato
        2. jansi.txt
          0.9 kB
          Andrew Wang
        3. TestClass.java
          0.6 kB
          Andrew Wang

            naoto Naoto Sato
            webbuggrp Webbug Group