-
Bug
-
Resolution: Not an Issue
-
P4
-
None
-
8, 11, 17, 22, 23, 24
-
x86_64
-
windows
ADDITIONAL SYSTEM INFORMATION :
$ java -version
java version "22.0.1" 2024-04-16
Java(TM) SE Runtime Environment (build 22.0.1+8-16)
Java HotSpot(TM) 64-Bit Server VM (build 22.0.1+8-16, mixed mode, sharing)
Windows 10, Windows 11
Also reproducible in Java 17, and probably in any other Java release older than 22.0.1
A DESCRIPTION OF THE PROBLEM :
If you run Java in Git Bash (a MINGW64-based environment), System.out uses the wrong charset.
Git is a de facto standard now, and on Windows its installer ships a UNIX-like Git Bash environment based on MINGW64. Many Java developers use Git Bash to build their code with Maven and Gradle and to run other R&D-related tasks, including Java-based tools.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1. Use the Jansi library
2. Run it in a CMD window with a command like java -jar jansi-2.4.1.jar
3. Run it again in a Git Bash (MINGW64) window with a command like java -jar jansi-2.4.1.jar
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Both step 2 and step 3 should print the pseudo-graphics logo of the Jansi library at the end of the demonstration.
ACTUAL -
Step 2 works properly.
Step 3 prints only question marks, because System.out uses the wrong encoding (cp1252) instead of UTF-8.
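To confirm which encoding the JVM actually picked in each shell, a small diagnostic like the one below can be run in both CMD and Git Bash and the output compared. This is only a sketch: the class name EncodingReport is made up for illustration, and it assumes that native.encoding is present from JDK 17 and stdout.encoding from JDK 19; on older releases those entries simply print "null".

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Jansi or the JDK.
public class EncodingReport {

    // Collects the encoding-related settings visible to this JVM.
    // Properties that do not exist on the running JDK are reported as "null".
    public static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        String[] properties = {"file.encoding", "native.encoding", "stdout.encoding"};
        for (String key : properties) {
            report.put(key, String.valueOf(System.getProperty(key)));
        }
        report.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        // Environment variables that Git Bash sets; usually absent in CMD.
        report.put("MSYSTEM", String.valueOf(System.getenv("MSYSTEM")));
        report.put("LC_CTYPE", String.valueOf(System.getenv("LC_CTYPE")));
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

If the values printed in Git Bash differ from what the locale settings report, that matches the mismatch described in this report.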
How do I know which encoding is correct in MINGW64? From either the output of the "locale" command or the LC_* environment variables:
$ locale
LANG=
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_ALL=
$ echo $LC_CTYPE
en_GB.UTF-8
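The locale values above follow the POSIX form language_TERRITORY.codeset, optionally followed by @modifier. A minimal sketch of turning such a value into a Java Charset is shown below. The class and method names are hypothetical, and the lookup order LC_ALL, then LC_CTYPE, then LANG is an assumption based on the usual POSIX precedence, not something the JDK does for you.

```java
import java.nio.charset.Charset;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Jansi or the JDK.
public class LocaleCharsetSketch {

    // Extracts the codeset from a locale name such as "en_GB.UTF-8".
    // Returns empty when there is no codeset or Java does not support it.
    public static Optional<Charset> charsetFromLocale(String localeValue) {
        if (localeValue == null) {
            return Optional.empty();
        }
        int dot = localeValue.indexOf('.');
        if (dot < 0) {
            return Optional.empty(); // e.g. "C" or "en_GB" carries no codeset
        }
        String codeset = localeValue.substring(dot + 1);
        int at = codeset.indexOf('@');
        if (at >= 0) {
            codeset = codeset.substring(0, at); // drop a trailing "@modifier"
        }
        try {
            return Optional.of(Charset.forName(codeset));
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return Optional.empty();
        }
    }

    // Assumed precedence: the first non-empty of LC_ALL, LC_CTYPE, LANG decides.
    public static Optional<Charset> fromEnvironment(Map<String, String> env) {
        for (String key : new String[] {"LC_ALL", "LC_CTYPE", "LANG"}) {
            String value = env.get(key);
            if (value != null && !value.isEmpty()) {
                return charsetFromLocale(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(fromEnvironment(System.getenv()));
    }
}
```

Returning an Optional keeps the caller in charge of the fallback when no usable codeset is found, for example in a plain CMD window where these variables are not set.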
---------- BEGIN SOURCE ----------
Code similar to what jansi-2.4.1.jar runs, to be used with jansi-2.4.1.jar on your classpath:
import org.fusesource.jansi.AnsiMain;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class TestClass {
    public static void main(String[] args) {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(AnsiMain.class.getResourceAsStream("jansi.txt"), StandardCharsets.UTF_8))) {
            for (String line = in.readLine(); line != null; line = in.readLine()) {
                System.out.println(line);
            }
        } catch (IOException e) {
            // ignore
        }
    }
}
You can also extract the jansi.txt file from jansi-2.4.1.jar and read it directly, so that the jar no longer needs to be on the classpath.
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
import org.fusesource.jansi.AnsiMain;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TestClass {
    public static void main(String[] args) {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(AnsiMain.class.getResourceAsStream("jansi.txt"), StandardCharsets.UTF_8))) {
            // Git Bash sets MSYSTEM to a value such as MINGW64.
            if (System.getenv("MSYSTEM") != null && System.getenv("MSYSTEM").startsWith("MINGW")) {
                // Take the codeset after the dot in LC_CTYPE, e.g. "UTF-8" from "en_GB.UTF-8".
                // Note: this assumes LC_CTYPE is set and contains a dot.
                Charset charset = Charset.forName(System.getenv("LC_CTYPE").split("\\.")[1]);
                for (String line = in.readLine(); line != null; line = in.readLine()) {
                    // Write raw bytes so that System.out does not re-encode the text.
                    System.out.write(line.getBytes(charset));
                    System.out.println();
                }
            } else {
                for (String line = in.readLine(); line != null; line = in.readLine()) {
                    System.out.println(line);
                }
            }
        } catch (IOException e) {
            // ignore
        }
    }
}
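An alternative to writing raw bytes line by line is to replace System.out once with a PrintStream that uses the desired charset. The sketch below assumes UTF-8 is what the terminal expects, as in the Git Bash case described here; the class and method names are hypothetical. It uses the PrintStream constructor that takes a charset name so that it also compiles on Java 8.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Jansi or the JDK.
public class Utf8StdoutSketch {

    // Wraps the target stream in an auto-flushing PrintStream that encodes
    // text with the given charset instead of the JVM's default for stdout.
    public static PrintStream withCharset(OutputStream target, Charset charset) {
        try {
            return new PrintStream(target, true, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Should not happen: the name comes from an existing Charset.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the terminal expects UTF-8.
        System.setOut(withCharset(new FileOutputStream(FileDescriptor.out), StandardCharsets.UTF_8));
        System.out.println("\u2588\u2580\u2584 block characters");
    }
}
```

On JDK 19 or later it may be enough to start the JVM with -Dstdout.encoding=UTF-8 instead of changing any code; this has not been verified against the setup in this report.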
FREQUENCY : always