JDK-8284778

System.out does not use the encoding/charset specified in the Javadoc


    • Type: CSR
    • Resolution: Approved
    • Priority: P4
    • 19
    • core-libs
    • None
    • behavioral
    • minimal
    • This change exposes two new system properties to set or get the stdout/stderr encoding; there is no compatibility impact.
    • Java API, System or security property
    • SE

      Summary

      Add new system properties to set or get the encoding of the standard streams (System.out and System.err).

      Problem

      Following the implementation of JEP 400, two issues exist with respect to the encoding of System.out/err:

      • The javadoc states that they fall back to Charset.defaultCharset() if there is no console, which is not correct. It should read "fall back to native.encoding".
      • There used to be a way to override the encoding on the command line by setting sun.stdout/err.encoding, but these properties were never documented.

      Because the default encoding and the default console encoding may differ after JEP 400, there should be a supported way to override the encoding used by System.out/err.
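The possible mismatch described above can be observed directly. The following is a minimal sketch, assuming JDK 17 or later (for the native.encoding property and Console.charset()); the class name EncodingReport and its collect() helper are illustrative and not part of this proposal.

```java
import java.io.Console;
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingReport {
    // Collects the encoding-related values that may disagree after JEP 400.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        // String.valueOf yields "null" if the property is absent (pre-17 runtimes).
        report.put("native.encoding",
                   String.valueOf(System.getProperty("native.encoding")));
        Console console = System.console();
        report.put("Console.charset()",
                   console == null ? "(no console)" : console.charset().name());
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running it once from a terminal and once with output redirected to a file shows whether a console is present and whether the reported values differ on a given platform.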

      Solution

      Add new system properties to set or get the encoding of the standard streams. This is equivalent to promoting the existing sun.stdout/err.encoding system properties to standard properties with new names.

      The default values of the properties are derived in a platform-dependent way, or from native.encoding if the platform does not provide streams for the console. The properties can be set on the launcher's command line with the -D option. The only supported value of the system properties is "UTF-8".
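As an illustration of how application code might read the new properties, here is a minimal sketch. The class StdStreamCharset and its resolve() helper are hypothetical; the fallback order (explicit property, then native.encoding, then the default charset) is a simplification chosen for the example and is not the JDK's internal logic, which derives the default in a platform-dependent way.

```java
import java.nio.charset.Charset;

public class StdStreamCharset {
    // Hypothetical helper: returns the first usable charset among the
    // explicit property value and native.encoding, else the default charset.
    static Charset resolve(String explicit, String nativeEncoding) {
        for (String name : new String[] { explicit, nativeEncoding }) {
            try {
                if (name != null && Charset.isSupported(name)) {
                    return Charset.forName(name);
                }
            } catch (IllegalArgumentException e) {
                // Illegal charset name; fall through to the next candidate.
            }
        }
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        String nativeEncoding = System.getProperty("native.encoding");
        // On runtimes without these properties, getProperty returns null.
        System.out.println("stdout: "
                + resolve(System.getProperty("stdout.encoding"), nativeEncoding));
        System.out.println("stderr: "
                + resolve(System.getProperty("stderr.encoding"), nativeEncoding));
    }
}
```

After compiling, the properties can be set at launch, for example `java -Dstdout.encoding=UTF-8 -Dstderr.encoding=UTF-8 StdStreamCharset`.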

      Specification

      Change the field description of java.lang.System#out as follows:

             * specified by the host environment or user. The encoding used
             * in the conversion from characters to bytes is equivalent to
             * {@link Console#charset()} if the {@code Console} exists,
      -      * {@link Charset#defaultCharset()} otherwise.
      +      * <a href="#stdout.encoding">stdout.encoding</a> otherwise.
             * <p>
             * For simple stand-alone Java applications, a typical way to write
             * a line of output data is:
             * <blockquote><pre>
             *     System.out.println(data)
      @@ -153,11 +153,11 @@
             * @see     java.io.PrintStream#println(int)
             * @see     java.io.PrintStream#println(long)
             * @see     java.io.PrintStream#println(java.lang.Object)
             * @see     java.io.PrintStream#println(java.lang.String)
             * @see     Console#charset()
      -      * @see     Charset#defaultCharset()
      +      * @see     <a href="#stdout.encoding">stdout.encoding</a>
             */

      Change the field description of java.lang.System#err as follows:

             * The encoding used in the conversion from characters to bytes is
             * equivalent to {@link Console#charset()} if the {@code Console}
      -      * exists, {@link Charset#defaultCharset()} otherwise.
      +      * exists, <a href="#stderr.encoding">stderr.encoding</a> otherwise.
             *
             * @see     Console#charset()
      -      * @see     Charset#defaultCharset()
      +      * @see     <a href="#stderr.encoding">stderr.encoding</a>
             */
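To see why the encoding described in these field descriptions matters, the sketch below shows that the same text produces different bytes under different charsets, and prints the charset the standard streams actually use. It assumes JDK 18 or later for PrintStream.charset(); the class name StreamCharsetCheck and the encodeWith() helper are illustrative only.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StreamCharsetCheck {
    // Writes text through a PrintStream that uses the given charset
    // and returns the bytes that reached the underlying stream.
    static byte[] encodeWith(Charset charset, String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream ps = new PrintStream(sink, true, charset)) {
            ps.print(text);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // Charset in effect for the standard streams of this runtime.
        System.out.println("System.out charset: " + System.out.charset());
        System.out.println("System.err charset: " + System.err.charset());

        String sample = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println("UTF-8 bytes:      "
                + encodeWith(StandardCharsets.UTF_8, sample).length);
        System.out.println("ISO-8859-1 bytes: "
                + encodeWith(StandardCharsets.ISO_8859_1, sample).length);
    }
}
```

The character is encoded as two bytes in UTF-8 and one byte in ISO-8859-1, so a terminal that expects a different encoding from the one the stream uses will display it incorrectly.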

      Append the following two rows to the standard properties table in the method description of System#getProperties():

      +      * <tr><th scope="row">{@systemProperty stdout.encoding}</th>
      +      *     <td>Character encoding name for {@link System#out System.out}.
      +      *     The Java runtime can be started with the system property set to {@code UTF-8},
      +      *     starting it with the property set to another value leads to undefined behavior.
      +      * <tr><th scope="row">{@systemProperty stderr.encoding}</th>
      +      *     <td>Character encoding name for {@link System#err System.err}.
      +      *     The Java runtime can be started with the system property set to {@code UTF-8},
      +      *     starting it with the property set to another value leads to undefined behavior.

            Assignee: Naoto Sato (naoto)
            Reporter: Webbug Group (webbuggrp)
            Reviewed By: Alan Bateman