-
Bug
-
Resolution: Duplicate
-
P3
-
None
-
1.4.2
-
x86
-
linux, windows_2000
Name: rmT116609 Date: 08/25/2003
FULL PRODUCT VERSION :
java version "1.4.1_04"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_04-b01)
Java HotSpot(TM) Client VM (build 1.4.1_04-b01, mixed mode)
java version "1.4.2_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_01-b06)
Java HotSpot(TM) Client VM (build 1.4.2_01-b06, mixed mode)
FULL OS VERSION :
Microsoft Windows 2000 [Version 5.00.2195]
A DESCRIPTION OF THE PROBLEM :
A java.lang.Error is thrown when converting a String that contains an invalid surrogate pair to a UTF-8 byte array using an OutputStreamWriter. The invalid pair (two high surrogates, \uD800 followed by \uD8FF) is split across two write operations. (See the program in "Source code for an executable test case".)
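To make the term "invalid surrogate pair" concrete, here is a minimal sketch that lists the positions of unpaired surrogates in a String. The class and method names are hypothetical and are not part of the test case; it also assumes the Character.isHighSurrogate and Character.isLowSurrogate methods, which were added in Java 5 and so are not available on the 1.4 releases this report is about.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the JDK or of the reported test case. */
class UnpairedSurrogateScan {

    /** Returns the indices of chars that are surrogates without a matching partner. */
    static List<Integer> findUnpaired(String s) {
        List<Integer> result = new ArrayList<Integer>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                boolean paired = i + 1 < s.length()
                        && Character.isLowSurrogate(s.charAt(i + 1));
                if (paired) {
                    i++; // skip the low half of a well-formed pair
                } else {
                    result.add(Integer.valueOf(i));
                }
            } else if (Character.isLowSurrogate(c)) {
                // a low surrogate that was not consumed as part of a pair
                result.add(Integer.valueOf(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The three strings of the test case, concatenated.
        String full = "abcdefgh" + "ijklmno\uD800" + "\uD8FFpqrstuv";
        System.out.println(findUnpaired(full)); // prints [15, 16]
    }
}
```

Both reported positions hold high surrogates (U+D800 and U+D8FF), so neither can complete a pair with the other.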
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the sample program under J2SDK 1.4.1.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The output expected is (as produced under JDK 1.3.1_08):
Expected: 6162636465666768696a6b6c6d6e6f3f3f70717273747576
Writing: 00610062006300640065006600670068 "abcdefgh"
Captured:6162636465666768
Writing: 0069006a006b006c006d006e006fd800 "ijklmno?"
Captured:696a6b6c6d6e6f3f
Writing: d8ff0070007100720073007400750076 "?pqrstuv"
Captured:3f70717273747576
Given the lookahead necessary to process well-formed surrogate pairs, the last several lines could (or should) have been:
Writing: 0069006a006b006c006d006e006fd800 "ijklmno?"
Captured:696a6b6c6d6e6f
Writing: d8ff0070007100720073007400750076 "?pqrstuv"
Captured:3f3f70717273747576
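The lookahead behavior described above can be sketched with java.nio.charset.CharsetEncoder directly. This is an illustration under stated assumptions, not the StreamEncoder implementation: the class name ChunkedUtf8Sketch and its encodeChunk method are hypothetical, malformed input is replaced (the UTF-8 encoder's default replacement is '?'), and a trailing high surrogate is carried over to the next chunk instead of being encoded immediately.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

/** Hypothetical helper for illustration; not the JDK's StreamEncoder. */
class ChunkedUtf8Sketch {
    private final CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);

    // Chars the encoder left unconsumed, e.g. a trailing high surrogate.
    private String carry = "";

    /** Encodes one chunk; pass last=true for the final chunk only. */
    byte[] encodeChunk(String chunk, boolean last) {
        CharBuffer in = CharBuffer.wrap(carry + chunk);
        // UTF-8 needs at most 3 bytes per char, so this buffer cannot overflow.
        ByteBuffer out = ByteBuffer.allocate(in.remaining() * 3 + 1);
        CoderResult cr = encoder.encode(in, out, last);
        if (cr.isUnderflow() && last) {
            cr = encoder.flush(out);
        }
        if (!cr.isUnderflow()) {
            throw new IllegalStateException("unexpected coder result: " + cr);
        }
        carry = in.toString(); // whatever was not consumed waits for the next chunk
        out.flip();
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            sb.append(Integer.toHexString((bytes[i] & 0xFF) | 0x100).substring(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChunkedUtf8Sketch enc = new ChunkedUtf8Sketch();
        System.out.println(hex(enc.encodeChunk("abcdefgh", false)));      // 6162636465666768
        System.out.println(hex(enc.encodeChunk("ijklmno\uD800", false))); // 696a6b6c6d6e6f
        System.out.println(hex(enc.encodeChunk("\uD8FFpqrstuv", true)));  // 3f3f70717273747576
    }
}
```

The second chunk yields no byte for U+D800 because the encoder cannot yet tell whether a low surrogate follows; both replacement bytes appear only once the third chunk shows that the pair is malformed.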
ACTUAL -
The output received was (as produced under J2SDK 1.4.1_04):
Expected: 6162636465666768696a6b6c6d6e6f3f3f70717273747576
Writing: 00610062006300640065006600670068 "abcdefgh"
Captured:6162636465666768
Writing: 0069006a006b006c006d006e006fd800 "ijklmno?"
Captured:696a6b6c6d6e6f
Writing: d8ff0070007100720073007400750076 "?pqrstuv"
java.lang.Error
at sun.nio.cs.StreamEncoder$CharsetSE.flushLeftoverChar(StreamEncoder.java:361)
at sun.nio.cs.StreamEncoder$CharsetSE.implWrite(StreamEncoder.java:381)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:136)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:146)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:204)
at java.io.Writer.write(Writer.java:126)
at TryCharsetConversionError.main(TryCharsetConversionError.java:54)
ERROR MESSAGES/STACK TRACES THAT OCCUR :
java.lang.Error
at sun.nio.cs.StreamEncoder$CharsetSE.flushLeftoverChar(StreamEncoder.java:361)
at sun.nio.cs.StreamEncoder$CharsetSE.implWrite(StreamEncoder.java:381)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:136)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:146)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:204)
at java.io.Writer.write(Writer.java:126)
at TryCharsetConversionError.main(TryCharsetConversionError.java:54)
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.io.OutputStreamWriter;
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.io.IOException;
/**
 * The <code>TryCharsetConversionError</code> class
 * provides a simple case exposing an error in J2SE 1.4
 * handling of UTF-8 conversions with "invalid" surrogate
 * pairs using an <code>OutputStreamWriter</code>.
 */
public class TryCharsetConversionError
{
    public static void main(String[] args)
    {
        String encoding = "UTF8";
        String strings[] = { "abcdefgh",
                             "ijklmno\uD800",
                             "\uD8FFpqrstuv" };
        /*
         * First, convert the full string.
         */
        StringBuffer sb = new StringBuffer();
        for ( int i = 0; i < strings.length; i++ )
            sb.append(strings[i]);
        String expected = sb.toString();
        try
        {
            System.out.println("Expected: "
                + dump(expected.getBytes(encoding)));
        }
        catch (UnsupportedEncodingException e)
        {
            e.printStackTrace(System.out);
            System.exit(3);
        }
        /*
         * Now convert the string using a stream approach.
         */
        ByteArrayOutputStream baos = new ByteArrayOutputStream(256);
        try
        {
            OutputStreamWriter osw =
                new OutputStreamWriter(baos, encoding);
            for ( int i = 0; i < strings.length; i++ )
            {
                String s = strings[i];
                System.out.println("Writing: "
                    + dump(s) + " \"" + s + "\"");
                osw.write(s);
                osw.flush();
                System.out.println("Captured:"
                    + dump(baos.toByteArray()));
                baos.reset();
            }
        }
        catch (UnsupportedEncodingException e)
        {
            e.printStackTrace(System.out);
            System.exit(4);
        }
        catch (IOException e)
        {
            e.printStackTrace(System.out);
            System.exit(5);
        }
    }
    private static String dump(String str)
    {
        byte[] bytes = new byte[str.length() * 2];
        for ( int i = 0, j = 0; i < str.length(); i++, j += 2 )
        {
            char c = str.charAt(i);
            bytes[j] = (byte) (c >>> 8);
            bytes[j + 1] = (byte) (c);
        }
        return dump(bytes);
    }
    private static String dump(byte[] bytes)
    {
        StringBuffer sb = new StringBuffer();
        for ( int i = 0; i < bytes.length; i++ )
        {
            sb.append(Integer.toHexString(bytes[i]
                & 0xFF | 0x100).substring(1));
        }
        return sb.toString();
    }
}
---------- END SOURCE ----------
Release Regression From : 1.3.1_09
The above release value was the last known release where this
bug was known to work. Since then there has been a regression.
(Incident Review ID: 200528)
======================================================================
- duplicates
-
JDK-4937360 sun.nio.cs.StreamEncoder throws java.lang.Error
-
- Closed
-
-
JDK-6275027 (cs) StreamEncoder throws uninformative Error when encoding unpaired surrogates
-
- Resolved
-