JDK-8266674

Interactive input replaces Unicode characters with \u0000

    • x86_64
    • windows_10

      ADDITIONAL SYSTEM INFORMATION :
      Windows 10
      PowerShell 7.1.3
      openjdk full version "16+36-2231"

      A DESCRIPTION OF THE PROBLEM :
      When System.in (or anything that relies on it, such as Console::readLine/readPassword) is used to read input interactively, non-ASCII characters such as U+20AC (€) are replaced with \u0000. Piped input is not affected.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      * create a file Issue.java with the given code
      * in PowerShell (that is, PowerShell 7 from https://github.com/PowerShell/PowerShell/, which is not the same as the built-in Windows PowerShell), run the following command to set the console encoding to UTF-8:
      [Console]::InputEncoding = [Console]::OutputEncoding = [Text.Encoding]::UTF8
      * run the test case with interactive stdin as shown below and provide the requested input:
      java '-Dfile.encoding=UTF-8' Issue.java
      * run the test case with piped stdin:
      Write-Output "x`u{20ac}x" | java '-Dfile.encoding=UTF-8' Issue.java


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      With interactive stdin, the output should be consistent with the piped-stdin output, namely:
      Copy-paste and enter the following: `x€x`: x€x
      interactive: `x€x` ([78, 20ac, 78])

      ACTUAL -
      With interactive stdin, the output is as follows:
      Copy-paste and enter the following: `x€x`: x€x
      interactive: `xx` ([78, 0, 78])


      ---------- BEGIN SOURCE ----------
      import java.io.Console;
      import java.io.IOException;
      import java.io.BufferedReader;
      import java.io.InputStreamReader;
      import static java.nio.charset.StandardCharsets.UTF_8;

      class Issue {
          public static void main(String... args) throws IOException {
              // [Console]::InputEncoding = [Console]::OutputEncoding = [Text.Encoding]::UTF8

              Console console = System.console();
              if (console != null) {
                  // java '-Dfile.encoding=UTF-8' Issue.java
                  System.out.print("Copy-paste and enter the following: `x\u20acx`: ");
                  BufferedReader in = new BufferedReader(new InputStreamReader(System.in, UTF_8));
                  String line = in.readLine();
                  print("interactive", line);
              } else {
                  // Write-Output "x`u{20ac}x" | java '-Dfile.encoding=UTF-8' Issue.java
                  String input = new String(System.in.readAllBytes(), UTF_8);
                  print("piped", input);
              }
          }

          static void print(String prefix, String message) {
              System.out.println("%s: `%s` (%s)".formatted(prefix, message, message.codePoints().mapToObj(Integer::toHexString).toList()));
          }
      }

      ---------- END SOURCE ----------
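
To narrow down where the substitution happens, it may help to look at the raw bytes that System.in delivers before any charset decoding takes place. The following is a minimal diagnostic sketch, not part of the original report: the class name RawStdinDump and the helper toHex are hypothetical, and it assumes the whole input line arrives in a single read of at most 64 bytes.

```java
import java.io.IOException;
import java.util.StringJoiner;

class RawStdinDump {
    // Formats the first `length` bytes as two-digit lowercase hex values separated by spaces.
    static String toHex(byte[] bytes, int length) {
        StringJoiner joiner = new StringJoiner(" ");
        for (int i = 0; i < length; i++) {
            joiner.add(String.format("%02x", bytes[i] & 0xff));
        }
        return joiner.toString();
    }

    public static void main(String... args) throws IOException {
        System.out.print("Copy-paste and enter the following: `x\u20acx`: ");
        // Assumption: one read() call returns the complete line, which is
        // typical for a line-buffered interactive console but not guaranteed.
        byte[] buffer = new byte[64];
        int n = System.in.read(buffer);
        // read() returns -1 at end of stream; treat that as zero bytes.
        System.out.println("raw bytes: " + toHex(buffer, Math.max(n, 0)));
    }
}
```

It can be run the same way as the test case, for example with java '-Dfile.encoding=UTF-8' RawStdinDump.java. Well-formed UTF-8 for `x€x` is 78 e2 82 ac 78 followed by the line terminator. If a 00 byte shows up where e2 82 ac is expected, that would suggest the character is already lost before the InputStreamReader decodes anything.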

      FREQUENCY : always


            naoto Naoto Sato
            webbuggrp Webbug Group