-
Bug
-
Resolution: Fixed
-
P4
-
18, 19
-
b20
-
generic
-
generic
-
Verified
ADDITIONAL SYSTEM INFORMATION :
Windows 11 10.0.22000
openjdk version "18" 2022-03-22
OpenJDK Runtime Environment (build 18+36-2087)
OpenJDK 64-Bit Server VM (build 18+36-2087, mixed mode, sharing)
A DESCRIPTION OF THE PROBLEM :
System.out's Javadoc states the following:
The encoding used in the conversion from characters to bytes is equivalent to Console.charset() if the Console exists, Charset.defaultCharset() otherwise.
When there is a Console, this is correct. However, when there is no Console, e.g. when output is redirected to a file, System.out in JDK 18 uses `native.encoding` rather than the result of calling Charset.defaultCharset(), which is affected by `file.encoding`. In prior JDKs you could control the encoding of a program's output with `file.encoding`, because the semantics stated by the Javadoc were correct. Now `native.encoding` cannot be set, and `sun.stdout.encoding` is an undocumented feature, so there is no longer a supported way to change the encoding.
In my opinion, the correct fix is to use `native.encoding` only when `file.encoding` is not specified, and to update the Javadoc to reflect this. That retains the output behavior of JDK 17 and below whether or not `file.encoding` is specified.
I am willing to make a PR to fix this whichever way is preferred.
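The fallback order proposed above could be sketched roughly as follows. This is only an illustration, not JDK code: the class `StdoutCharsetSketch` and its methods are hypothetical, and the `fileEncoding` argument is assumed to be null when `file.encoding` was not given on the command line (the real JDK would have to track that separately, since the property always has a value at runtime).

```java
import java.nio.charset.Charset;

public class StdoutCharsetSketch {

    // Hypothetical helper: picks the charset for System.out in the proposed order.
    // consoleCharset: Console.charset() if a Console exists, otherwise null.
    // fileEncoding:   value of file.encoding if explicitly specified, otherwise null (assumption).
    // nativeEncoding: value of native.encoding.
    static Charset chooseStdoutCharset(Charset consoleCharset,
                                       String fileEncoding,
                                       String nativeEncoding) {
        if (consoleCharset != null) {
            return consoleCharset;                 // a Console exists: use its charset
        }
        if (fileEncoding != null) {
            return lookup(fileEncoding);           // explicit file.encoding wins, as in JDK 17
        }
        return lookup(nativeEncoding);             // otherwise fall back to native.encoding
    }

    // Resolves a charset name, falling back to the default charset if it is unusable.
    static Charset lookup(String name) {
        try {
            if (name != null && Charset.isSupported(name)) {
                return Charset.forName(name);
            }
        } catch (IllegalArgumentException e) {
            // illegal charset name: fall through to the default
        }
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        // No console, file.encoding given explicitly
        System.out.println(chooseStdoutCharset(null, "ISO-8859-1", "UTF-8"));
        // No console, file.encoding not given
        System.out.println(chooseStdoutCharset(null, null, "UTF-8"));
    }
}
```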
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Reproduction steps made on Linux, but can be adapted to other OSes:
1. Compile the source code attached.
2. Run `java --add-opens=java.base/java.io=ALL-UNNAMED Test >test.txt`
3. Inspect test.txt to see it states the following (Windows shows a different System.out):
console: null
'default' charset: UTF-8
System.out: UTF-8
4. Try changing the 'default' charset and therefore what should be used by System.out according to the Javadoc. Run `java -Dfile.encoding=Cp1252 --add-opens=java.base/java.io=ALL-UNNAMED Test >test.txt`:
console: null
'default' charset: windows-1252
System.out: UTF-8
5. Notice that the System.out charset does not change, despite the change to the default charset.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
System.out should change with file.encoding when a Console is not present, as documented by the Javadoc.
ACTUAL -
See the reproduction steps, especially step 5.
---------- BEGIN SOURCE ----------
public class Test {
    public static void main(String[] args) throws Throwable {
        System.out.println("console: " + System.console()); // Show if the console was present
        if (System.console() != null) {
            // Show the console's charset
            System.out.println("console charset: " + System.console().charset());
        }
        System.out.println("'default' charset: " + java.nio.charset.Charset.defaultCharset()); // Show the "default" charset
        // Read PrintStream's private "charset" field; requires --add-opens=java.base/java.io=ALL-UNNAMED
        var charsetField = System.out.getClass().getDeclaredField("charset");
        charsetField.setAccessible(true);
        System.out.println("System.out: " + charsetField.get(System.out)); // Show the charset used by System.out
    }
}
---------- END SOURCE ----------
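As a possible variant of the test program, the same information can probably be gathered without reflection or `--add-opens`, assuming a JDK 18 or later runtime where `PrintStream.charset()` is available. The class name `CharsetReport` and the `describe()` helper are illustrative only and not part of the original report.

```java
import java.io.Console;
import java.nio.charset.Charset;

public class CharsetReport {

    // Builds the same report as the attached Test class, using public APIs only.
    // Assumes JDK 18+ (PrintStream.charset()) and JDK 17+ (Console.charset(), native.encoding).
    static String describe() {
        Console console = System.console();
        StringBuilder sb = new StringBuilder();
        sb.append("console: ").append(console).append('\n');
        if (console != null) {
            sb.append("console charset: ").append(console.charset()).append('\n');
        }
        sb.append("native.encoding: ").append(System.getProperty("native.encoding")).append('\n');
        sb.append("'default' charset: ").append(Charset.defaultCharset()).append('\n');
        sb.append("System.out: ").append(System.out.charset()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

It can be run the same way as the original, for example with output redirected to a file and with or without `-Dfile.encoding=Cp1252`, to compare the "default" charset against the one System.out reports.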
CUSTOMER SUBMITTED WORKAROUND :
Use sun.stdout.encoding, an undocumented and unsupported property.
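An alternative that avoids the unsupported property, for applications that control their own startup code, might be to replace System.out with a PrintStream that is given its charset explicitly. The sketch below is an assumption about how that could look, not something proposed in this report; `StdoutRewrap` and `wrap` are hypothetical names.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;

public class StdoutRewrap {

    // Wraps a raw byte stream in an auto-flushing PrintStream with an explicit charset.
    static PrintStream wrap(OutputStream out, Charset charset) {
        return new PrintStream(out, true, charset);
    }

    public static void main(String[] args) {
        // Follow Charset.defaultCharset(), which is affected by file.encoding,
        // as the Javadoc describes for the no-Console case.
        Charset target = Charset.defaultCharset();
        System.setOut(wrap(new FileOutputStream(FileDescriptor.out), target));
        System.out.println("System.out now encodes with: " + target);
    }
}
```

This only affects code that runs after the call to `System.setOut`, so it would need to happen early in `main`.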
FREQUENCY : always
csr for
- JDK-8284778 System.out does not use the encoding/charset specified in the Javadoc (Closed)
relates to
- JDK-8293957 Document new system properties stdout.encoding and stderr.encoding in Internationalization Guide (Resolved)
- JDK-8187041 JEP 400: UTF-8 by Default (Closed)
- JDK-8294940 Backport JDK-8283620 to JDK 18 (Closed)