JDK-8283620

System.out does not use the encoding/charset specified in the Javadoc

    • b20
    • generic
    • generic
    • Verified

      ADDITIONAL SYSTEM INFORMATION :
      Windows 11 10.0.22000

      openjdk version "18" 2022-03-22
      OpenJDK Runtime Environment (build 18+36-2087)
      OpenJDK 64-Bit Server VM (build 18+36-2087, mixed mode, sharing)

      A DESCRIPTION OF THE PROBLEM :
      System.out's Javadoc states the following:
      The encoding used in the conversion from characters to bytes is equivalent to Console.charset() if the Console exists, Charset.defaultCharset() otherwise.

      When there is a Console, this is correct. However, when there is no Console, e.g. when output is redirected to a file, System.out in JDK 18 uses `native.encoding` rather than the result of calling Charset.defaultCharset(), which is controlled by `file.encoding`. In prior JDKs you could control the output encoding of a program using `file.encoding`, because the semantics stated by the Javadoc were accurate. Now `native.encoding` cannot be set, and `sun.stdout.encoding` is an undocumented feature, so there is no longer an officially supported way to change the encoding.

      In my opinion, the correct fix is to use `native.encoding` only when `file.encoding` is not specified, which retains the output behavior of JDK 17 and below whether or not `file.encoding` is specified, and to update the Javadoc to reflect this.
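
The proposed selection order can be sketched as a small helper. This is only an illustration of the rule described above, not JDK code: the class `StdoutCharsetSketch`, the method `resolve`, and its parameters are hypothetical, and a null `fileEncoding` is assumed to mean "not specified by the user".

```java
import java.nio.charset.Charset;

// Hypothetical sketch of the proposed rule; none of these names exist in the JDK.
class StdoutCharsetSketch {

    // consoleCharset: charset of the Console, or null if there is no Console.
    // fileEncoding:   value the user gave for file.encoding, or null if unspecified (assumption).
    // nativeEncoding: value of native.encoding.
    static Charset resolve(Charset consoleCharset, String fileEncoding, String nativeEncoding) {
        if (consoleCharset != null) {
            return consoleCharset; // a Console exists: keep using its charset
        }
        // No Console: prefer file.encoding, fall back to native.encoding.
        String name = (fileEncoding != null) ? fileEncoding : nativeEncoding;
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Null, illegal, or unsupported name: fall back to the default charset.
            return Charset.defaultCharset();
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "Cp1252", "UTF-8")); // file.encoding wins
        System.out.println(resolve(null, null, "UTF-8"));     // falls back to native.encoding
    }
}
```

With this ordering, redirecting output while passing `-Dfile.encoding=Cp1252` would select windows-1252, matching the JDK 17 behavior described above.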

      I am willing to make a PR to fix this whichever way is preferred.


      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Reproduction steps made on Linux, but can be adapted to other OSes:
      1. Compile the source code attached.
      2. Run `java --add-opens=java.base/java.io=ALL-UNNAMED Test >test.txt`
      3. Inspect test.txt; it should contain the following (Windows shows a different System.out charset):
      console: null
      'default' charset: UTF-8
      System.out: UTF-8
      4. Change the 'default' charset, which according to the Javadoc should also change the charset used by System.out. Run `java -Dfile.encoding=Cp1252 --add-opens=java.base/java.io=ALL-UNNAMED Test >test.txt`:
      console: null
      'default' charset: windows-1252
      System.out: UTF-8

      5. Notice that the System.out charset does not change, despite the default charset change.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      The System.out charset should follow file.encoding when a Console is not present, as documented by the Javadoc.
      ACTUAL -
      See reproduction steps, especially #5.

      ---------- BEGIN SOURCE ----------
      public class Test {
        public static void main(String[] args) throws Throwable {
          System.out.println("console: " + System.console()); // Show if the console was present
          if (System.console() != null) System.out.println("console charset: " + System.console().charset()); // Show the console's charset
          System.out.println("'default' charset: " + java.nio.charset.Charset.defaultCharset()); // Show the "default" charset
          var charsetField = System.out.getClass().getDeclaredField("charset");
          charsetField.setAccessible(true);
          System.out.println("System.out: " + charsetField.get(System.out)); // Show the charset used by System.out
        }
      }
      ---------- END SOURCE ----------
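
As a variation on the attached test, the same information can likely be gathered without reflection or `--add-opens`, assuming a JDK 18 or later runtime where `PrintStream.charset()` is available. The class name `StdoutCharsetReport` and the extra `native.encoding` line are additions for illustration and are not part of the original reproduction.

```java
import java.nio.charset.Charset;

// Illustrative variant of the test program; assumes JDK 18+ for PrintStream.charset().
class StdoutCharsetReport {

    // Builds the report as a string so it can be printed or inspected.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        sb.append("console: ").append(System.console()).append('\n');
        sb.append("'default' charset: ").append(Charset.defaultCharset()).append('\n');
        // native.encoding is a read-only system property (available since JDK 17).
        sb.append("native.encoding: ").append(System.getProperty("native.encoding")).append('\n');
        // Charset that System.out uses for character-to-byte conversion.
        sb.append("System.out: ").append(System.out.charset()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running it with and without `-Dfile.encoding=Cp1252` while redirecting output to a file should show whether the last line follows the 'default' charset.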

      CUSTOMER SUBMITTED WORKAROUND :
      Use sun.stdout.encoding, an undocumented and unsupported property.
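
For applications that control their own startup code, one possible alternative that relies only on documented API is to replace System.out with a PrintStream created with an explicit charset. This is a sketch and was not suggested by the submitter: the class `StdoutRewrap` and the method `wrap` are hypothetical names, and it does not help programs whose source cannot be changed.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;

// Hypothetical application-level workaround; not part of the JDK.
class StdoutRewrap {

    // Wraps a byte stream in an auto-flushing PrintStream that encodes with the given charset.
    static PrintStream wrap(OutputStream out, Charset charset) {
        return new PrintStream(out, true, charset);
    }

    public static void main(String[] args) {
        // Re-create System.out on the standard output descriptor so that it
        // encodes with Charset.defaultCharset(), which follows file.encoding.
        System.setOut(wrap(new FileOutputStream(FileDescriptor.out), Charset.defaultCharset()));
        System.out.println("'default' charset: " + Charset.defaultCharset());
    }
}
```

The replacement stream here is unbuffered apart from PrintStream's own handling, so a real application might want to add a BufferedOutputStream between the two.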

      FREQUENCY : always


            naoto Naoto Sato
            webbuggrp Webbug Group