Bug
Resolution: Other
P4
None
16, 17
x86_64
windows_10
ADDITIONAL SYSTEM INFORMATION :
Windows 10
PowerShell 7.1.3
openjdk full version "16+36-2231"
A DESCRIPTION OF THE PROBLEM :
When System.in (or anything that relies on it, such as Console::readLine/readPassword) is used to read input interactively, non-ASCII characters such as € are replaced with \u0000.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
* create a file Issue.java with the given code
* in PowerShell (that is, https://github.com/PowerShell/PowerShell/, which is not the same as Windows PowerShell), run the following command to set the console input and output encoding to UTF-8:
[Console]::InputEncoding = [Console]::OutputEncoding = [Text.Encoding]::UTF8
* run the test case with interactive stdin as below and provide the requested input:
java '-Dfile.encoding=UTF-8' Issue.java
* run the test case with piped stdin:
Write-Output "x`u{20ac}x" | java '-Dfile.encoding=UTF-8' Issue.java
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
With interactive stdin, the output is consistent with that of piped stdin, namely:
Copy-paste and enter the following: `x€x`: x€x
interactive: `x€x` ([78, 20ac, 78])
ACTUAL -
With interactive stdin, the output is as follows:
Copy-paste and enter the following: `x€x`: x€x
interactive: `xx` ([78, 0, 78])
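To make the difference between the two code point lists concrete, here is a minimal sketch of what UTF-8 decoding yields for two byte sequences. The class name DecodeSketch and the second byte sequence (78 00 78) are assumptions for illustration only: the report does not show which bytes the JVM actually received, so this demonstrates the decoding arithmetic, not a confirmed diagnosis.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the reproducer.
class DecodeSketch {
    // Decode raw bytes as UTF-8 and list the code points in hex,
    // in the same style as the reproducer's print() method.
    static List<String> codePoints(byte[] raw) {
        String s = new String(raw, StandardCharsets.UTF_8);
        return s.codePoints()
                .mapToObj(Integer::toHexString)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 'x', then U+20AC encoded as the three bytes E2 82 AC, then 'x'.
        byte[] wellFormed = {0x78, (byte) 0xE2, (byte) 0x82, (byte) 0xAC, 0x78};
        // Assumed input: the euro sign replaced by a single zero byte.
        byte[] withNul = {0x78, 0x00, 0x78};

        System.out.println(codePoints(wellFormed)); // [78, 20ac, 78]
        System.out.println(codePoints(withNul));    // [78, 0, 78]
    }
}
```

The first list matches the EXPECTED output and the second matches the ACTUAL output, which is consistent with the character being lost before the InputStreamReader decodes it.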
---------- BEGIN SOURCE ----------
import java.io.Console;
import java.io.IOException;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import static java.nio.charset.StandardCharsets.UTF_8;

class Issue {
    public static void main(String... args) throws IOException {
        // [Console]::InputEncoding = [Console]::OutputEncoding = [Text.Encoding]::UTF8
        Console console = System.console();
        if (console != null) {
            // java '-Dfile.encoding=UTF-8' Issue.java
            System.out.print("Copy-paste and enter the following: `x\u20acx`: ");
            BufferedReader in = new BufferedReader(new InputStreamReader(System.in, UTF_8));
            String line = in.readLine();
            print("interactive", line);
        } else {
            // Write-Output "x`u{20ac}x" | java '-Dfile.encoding=UTF-8' Issue.java
            String input = new String(System.in.readAllBytes(), UTF_8);
            print("piped", input);
        }
    }

    static void print(String prefix, String message) {
        System.out.println("%s: `%s` (%s)".formatted(prefix, message,
                message.codePoints().mapToObj(Integer::toHexString).toList()));
    }
}
---------- END SOURCE ----------
FREQUENCY : always
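A possible follow-up diagnostic, sketched under the assumption that it is useful to see the raw bytes before any charset decoding: the hypothetical class StdinHexDump below reads one line from an InputStream and prints each byte in hex. It is not part of the original test case. On Windows the line typically ends with \r\n, so a trailing 0d may appear before the terminating newline.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical diagnostic, not part of the reproducer.
class StdinHexDump {
    // Read bytes up to (but not including) the first '\n' or end of stream
    // and return them as space-separated two-digit hex values.
    static String hexOfLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.print("Copy-paste and enter the following: `x\u20acx`: ");
        System.out.println(hexOfLine(System.in));
    }
}
```

A well-formed UTF-8 stream would contain the sequence e2 82 ac between the two 78 bytes. Running this interactively and then with piped stdin, using the same commands as in the steps above, would show whether the bytes already differ at the System.in level.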