- Bug
- Resolution: Fixed
- P4
- 1.2.2
- 012
- sparc
- solaris_8
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2048677 | 1.4.1 | Ian Little | P4 | Closed | Fixed | hopper |
JDK-2048676 | 1.4.0_01 | Ian Little | P4 | Closed | Fixed | 01 |
JDK-2048675 | 1.3.1_03 | Ian Little | P4 | Resolved | Fixed | 03 |
JDK-2048674 | 1.2.2_12 | Ian Little | P4 | Resolved | Fixed | 12 |
Name: md23716 Date: 11/02/2001
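The problem exists in 1.2.2, 1.3.1 and 1.4 and requires fixing in each.
Converting a zeroed byte array to a String with the EUC_TW converter (encoding name cns11643, as used by the zh_TW locale) results in an empty string. The same test with the default locale's encoding (Cp1252 in the testcase) results in a non-empty string. The EUC_TW byte-to-char converter is skipping valid zero bytes.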
Simple testcase :
======================================================================
import java.io.*;
public class Exercise
{
    public static void main(String[] args)
    {
        test("cns11643");
        test("Cp1252");
    }
    public static void test(String encoding)
    {
        String result = null;
        byte[] data = new byte[16];
        int i;
        System.err.println(">>>> " + encoding + " with zero'd byte array");
        for (i = 0; i < 16; i++)
        {
            data[i] = 0;
        }
        try
        {
            result = new String(data, encoding);
            System.err.println("length of string = " + result.length());
        }
        catch (Exception ex)
        {
            ex.printStackTrace();
        }
        for (i = 0; i < 16; i++)
        {
            data[i] = (byte)(32 + i);
        }
        System.err.println(">>>> " + encoding + " with non-zero'd byte array");
        try
        {
            result = new String(data, encoding);
            System.err.println("length of string = " + result.length());
        }
        catch (Exception ex)
        {
            ex.printStackTrace();
        }
    }
}
======================================================================
Suggested Fix :
Looking at the EUC_TW converter code revealed that a valid character
(the "nil" character, \u0000) was being used as the marker for filtering
out bad conversions, so genuine zero bytes were dropped from the output.
The testcase passes when an invalid character (\uFFFF) is used instead.
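The effect of the marker choice can be reproduced in isolation. The following is a minimal sketch, not the ByteToCharEUC_TW code itself: the class name SentinelSketch and the method decode are hypothetical, and the "converter" simply maps each byte to the char with the same value. It only illustrates why a marker that is also a valid output character causes input to be silently skipped.

```java
public class SentinelSketch {
    // Hypothetical one-byte converter used only for illustration.
    // 'sentinel' plays the role of the "nothing to emit" marker that
    // outputChar is compared against in the real converter.
    static String decode(byte[] input, char sentinel) {
        StringBuilder out = new StringBuilder();
        char outputChar = sentinel;
        for (byte b : input) {
            outputChar = (char) (b & 0xFF);   // trivial byte-to-char mapping
            if (outputChar != sentinel) {     // same shape as the check in the diff
                out.append(outputChar);
                outputChar = sentinel;        // reset the marker after emitting
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] zeros = new byte[16];
        // Marker is (char) 0: every zero byte looks like "no output" and is dropped.
        System.out.println(decode(zeros, (char) 0).length());   // prints 0
        // Marker is '\uFFFF': zero bytes are emitted as U+0000 characters.
        System.out.println(decode(zeros, '\uFFFF').length());   // prints 16
    }
}
```

In this toy model the marker '\uFFFF' is safe only because the mapping can never produce it; the same assumption underlies the suggested fix.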
Context diff for ByteToCharEUC_TW.java :
======================================================================
***************
*** 61,69 ****
          throws UnknownCharacterException, MalformedInputException,
                 ConversionBufferFullException
      {
          int inputSize = 0;
!         char outputChar = (char) 0;
          byteOff = inOff;
          charOff = outOff;
--- 61,69 ----
          throws UnknownCharacterException, MalformedInputException,
                 ConversionBufferFullException
      {
          int inputSize = 0;
!         char outputChar = '\uFFFF'; //ibm@37723
          byteOff = inOff;
          charOff = outOff;
***************
*** 150,158 ****
                  break;
              }
              byteOff++;
!             if (outputChar != (char) 0) {
                  if (outputChar == REPLACE_CHAR) {
                      if (subMode) // substitution enabled
                          outputChar = subChars[0];
                      else {
--- 150,158 ----
                  break;
              }
              byteOff++;
!             if (outputChar != '\uFFFF') { //ibm@37723
                  if (outputChar == REPLACE_CHAR) {
                      if (subMode) // substitution enabled
                          outputChar = subChars[0];
                      else {
***************
*** 160,168 ****
                          throw new UnknownCharacterException();
                      }
                  }
                  output[charOff++] = outputChar;
!                 outputChar = 0;
              }
          }
          return charOff - outOff;
--- 160,168 ----
                          throw new UnknownCharacterException();
                      }
                  }
                  output[charOff++] = outputChar;
!                 outputChar = '\uFFFF'; //ibm@37723
              }
          }
          return charOff - outOff;
======================================================================
- backported by
  - JDK-2048674 Encoding zero'd byte array using zh_TW locale results in empty string (Resolved)
  - JDK-2048675 Encoding zero'd byte array using zh_TW locale results in empty string (Resolved)
  - JDK-2048676 Encoding zero'd byte array using zh_TW locale results in empty string (Closed)
  - JDK-2048677 Encoding zero'd byte array using zh_TW locale results in empty string (Closed)