JDK-8354910

Output by java.io.IO or System.console() corrupted for some non-ASCII characters


      ADDITIONAL SYSTEM INFORMATION :
      -----
      Java 25 (1)

      openjdk 25-ea 2025-09-16
      OpenJDK Runtime Environment (build 25-ea+18-2118)
      OpenJDK 64-Bit Server VM (build 25-ea+18-2118, mixed mode, sharing)

      Windows 11 23H2

      -----
      Java 25 (2)

      openjdk 25-ea 2025-09-16
      OpenJDK Runtime Environment (build 25-ea+18-2118)
      OpenJDK 64-Bit Server VM (build 25-ea+18-2118, mixed mode, sharing)

      Ubuntu 24.04 in WSL

      -----
      Java 21 (1)

      openjdk 21.0.3 2024-04-16 LTS
      OpenJDK Runtime Environment Temurin-21.0.3+9 (build 21.0.3+9-LTS)
      OpenJDK 64-Bit Server VM Temurin-21.0.3+9 (build 21.0.3+9-LTS, mixed mode, sharing)

      Windows 11 23H2

      -----
      Java 21 (2)

      openjdk 21.0.6 2025-01-21
      OpenJDK Runtime Environment (build 21.0.6+7-Ubuntu-124.04.1)
      OpenJDK 64-Bit Server VM (build 21.0.6+7-Ubuntu-124.04.1, mixed mode, sharing)

      Ubuntu 24.04.2

      A DESCRIPTION OF THE PROBLEM :
      Output methods of java.io.IO and System.console(), such as IO.println and System.console().println(), mistakenly replace the high 8 bits of a non-ASCII UTF-16 code unit with 0xFF when the low 8 bits of that code unit are 0x80 or greater.

      This bug confuses non-English-speaking Java learners who use JShell and defeats the purpose of the JEPs that introduce java.io.IO.

      System.console().printf in Java 21 has the same bug on Windows and Ubuntu. (System.console() returns null in Java 17 on Windows, so it cannot be tested there.)
      System.out.println does not have this bug.
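
The corruption rule stated in this description can be written down as a small model, which makes it easy to predict the broken output for any input string. The sketch below is only an illustration of the reported symptom. The class and method names are hypothetical, and `recombineWithoutMask` is an assumption about one way such a symptom can arise (a sign-extended low byte), not something taken from the JDK sources.

```java
class HighByteCorruptionModel {
    // Symptom as described in this report: when the low 8 bits of a UTF-16
    // code unit are 0x80 or greater, the high 8 bits come out as 0xFF.
    static char corrupt(char c) {
        return (c & 0x80) != 0 ? (char) (0xFF00 | (c & 0xFF)) : c;
    }

    // Assumption, not taken from the JDK sources: recombining two bytes
    // without masking the low one lets a negative low byte sign-extend and
    // overwrite the high byte, which yields the same result as corrupt().
    static char recombineWithoutMask(char c) {
        byte hi = (byte) (c >> 8);
        byte lo = (byte) c;
        return (char) ((hi << 8) | lo);
    }

    // Applies the model to every code unit of a string.
    static String corrupt(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(corrupt(s.charAt(i)));
        }
        return sb.toString();
    }

    // Hex dump of the code units, so the result does not depend on the
    // terminal encoding.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A few inputs from this report, spelled as Unicode escapes.
        String[] samples = {
            "\u00A5100",                        // yen sign followed by 100
            "1\u679A",                          // digit 1 and U+679A
            "\u30C9\u30E9\u3048\u3082\u3093",   // the five-kana sample
            "\u227F",                           // low byte 0x7F
            "\u2280"                            // low byte 0x80
        };
        for (String s : samples) {
            System.out.println(hex(s) + " -> " + hex(corrupt(s)));
        }
    }
}
```

Under this model, U+679A becomes U+FF9A (a halfwidth katakana letter) and U+3048 is left alone because its low byte is 0x48, which is consistent with the ACTUAL output listed in this report. U+227F is unchanged while U+2280 becomes U+FF80, matching the boundary at 0x80.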

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      1. Launch jshell --enable-preview in Windows Terminal or the VS Code terminal.
      2. Type and run:

      IO.println("¥100");
      IO.println("1枚");
      IO.println("1日");
      IO.println("1斤");
      IO.println("1人");
      IO.println("人");
      IO.println("亀");
      IO.println("ドラえもん");
      IO.println("哆啦A梦");
      IO.println("\u2180");
      IO.println("\u2280");
      IO.println("\u2281");

      Note:

      1. You can replace IO.println with System.console().println.
      2. You can reproduce this bug without --enable-preview if you replace IO.println with System.console().printf.
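
The second note can be sketched as a small class that prints every sample string twice, once through System.console().printf and once through System.out.println, so the two paths can be compared line by line. The class name `ConsolePrintfRepro` and the `SAMPLES` list are hypothetical names introduced for this illustration. This report only confirms the problem inside JShell, so the class is meant to be pasted into a JShell session and started with `ConsolePrintfRepro.main(new String[0])`. Whether a compiled program shows the same corruption is not established here.

```java
import java.io.Console;
import java.util.List;

class ConsolePrintfRepro {
    // The sample strings from the steps above, spelled as Unicode escapes so
    // that the encoding of the pasted text cannot affect the test.
    static final List<String> SAMPLES = List.of(
        "\u00A5100", "1\u679A", "1\u65E5", "1\u65A4", "1\u4EBA",
        "\u4EBA", "\u4E80",
        "\u30C9\u30E9\u3048\u3082\u3093",
        "\u54C6\u5566A\u68A6",
        "\u2180", "\u2280", "\u2281");

    public static void main(String[] args) {
        Console console = System.console();
        if (console == null) {
            // No console is attached, for example when output is redirected.
            System.out.println("System.console() returned null; use an interactive terminal.");
            return;
        }
        for (String s : SAMPLES) {
            console.printf("%s%n", s);   // path reported as corrupted
            console.flush();             // keep the two outputs in order
            System.out.println(s);       // path reported as correct
        }
    }
}
```

If the two lines printed for a sample differ, that sample is affected. According to this report, only code units whose low byte is 0x80 or greater should differ.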


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      The argument strings are output as is.
      ACTUAL -
      jshell> IO.println("¥100");
      ᆬ100

      jshell> IO.println("1枚");
      1レ

      jshell> IO.println("1日");
      1¥

      jshell> IO.println("1斤");
      1ᄂ

      jshell> IO.println("1人");
      1ᄎ

      jshell> IO.println("人");


      jshell> IO.println("亀");


      jshell> IO.println("ドラえもん");
      ￉←えツモ

      jshell> IO.println("哆啦A梦");
      ᅥ啦Aᆭ

      jshell> IO.println("\u2180");


      jshell> IO.println("\u2280");


      jshell> IO.println("\u2281");


      Note: System.console().{println,printf} output the same corrupted strings as IO.println.

      FYI(1):

      jshell> 'ﾀ' == 0xff80
      $1 ==> true

      jshell> 'ﾁ' == 0xff81
      $2 ==> true

      FYI(2):

      jshell> IO.println("\u227F");


      jshell> '≿' == 0x227f
      $16 ==> true

      ---------- BEGIN SOURCE ----------
      // Not reproduced when run as a source file or as a compiled class:
      //   java --enable-preview ./SomeFile.java
      //   javac --enable-preview --source 25 ./SomeFile.java && java --enable-preview SomeFile
      // Tested Java source (its output is correct, unlike in JShell):
      void main() {
          IO.println("ドラえもん");
      }
      ---------- END SOURCE ----------

            jlahoda Jan Lahoda
            webbuggrp Webbug Group