-
Bug
-
Resolution: Won't Fix
-
P3
-
None
-
7
-
None
FULL PRODUCT VERSION :
java version "1.7.0_17"
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) Client VM (build 23.7-b01, mixed mode, sharing)
java version "1.8.0-ea"
Java(TM) SE Runtime Environment (build 1.8.0-ea-b82)
Java HotSpot(TM) Client VM (build 25.0-b23, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows XP [Version 5.1.2600]
A DESCRIPTION OF THE PROBLEM :
UTF-16 adds a byte-order mark (BOM) when it is used as the encoding to get bytes from a String. It also writes a BOM when used with an OutputStreamWriter.
UTF-32 does neither. By trial and error I found that big-endian byte order is used for the encoding, but this should be made explicit.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
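import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;

public class BomTest {
    public static void main(String[] args) throws Exception {
        String[] charsets = { "UTF-16", "UTF-32" };
        for (String charset : charsets) {
            // Encode directly with String.getBytes
            for (byte b : "a".getBytes(charset)) {
                System.out.format("%02x ", b);
            }
            System.out.println();
            // Encode through an OutputStreamWriter
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            OutputStreamWriter writer = new OutputStreamWriter(out, charset);
            writer.write("a");
            writer.close();
            for (byte b : out.toByteArray()) {
                System.out.format("%02x ", b);
            }
            System.out.println();
        }
    }
}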
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
fe ff 00 61
fe ff 00 61
00 00 fe ff 00 00 00 61
00 00 fe ff 00 00 00 61
ACTUAL -
fe ff 00 61
fe ff 00 61
00 00 00 61
00 00 00 61
REPRODUCIBILITY :
This bug can always be reproduced.
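A possible workaround, shown only as a sketch: since the "UTF-32" encoder emits big-endian bytes without a BOM, the BOM character U+FEFF can be prepended to the text and encoded explicitly with "UTF-32BE". The class name Utf32BomWorkaround and the helper encodeWithBom are hypothetical names used for illustration, and the sketch assumes the runtime provides the UTF-32BE charset, which is not one of the charsets every Java platform is required to support.

```java
import java.nio.charset.Charset;

public class Utf32BomWorkaround {
    // Hypothetical helper: encodes text as big-endian UTF-32 with a leading BOM.
    static byte[] encodeWithBom(String text) {
        // UTF-32BE does not emit a BOM on its own, so U+FEFF is prepended
        // and encoded like any other character.
        // Assumption: the runtime supports the "UTF-32BE" charset.
        return ("\uFEFF" + text).getBytes(Charset.forName("UTF-32BE"));
    }

    public static void main(String[] args) {
        for (byte b : encodeWithBom("a")) {
            System.out.format("%02x ", b);
        }
        System.out.println();
    }
}
```

For the input "a" this yields the eight bytes listed under EXPECTED for UTF-32: 00 00 fe ff 00 00 00 61.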