Type: Bug
Resolution: Fixed
Priority: P4
Affected Versions: 8, 11, 17, 18, 19
Fixed in Build: b09
OS: generic
CPU: generic
ADDITIONAL SYSTEM INFORMATION :
No restriction; the problem occurs on all hardware and operating systems.
A DESCRIPTION OF THE PROBLEM :
1. The file where the bug is located
class file: sun.nio.cs.ext.IBM864.class
jar package file: charsets.jar
2. The main symptom of the bug
Test string: "<%adc"
(1) Encoded with the UTF-8 character set, its hexadecimal byte sequence is:
3c 25 61 64 63
(2) Encoded with the IBM864 character set, its hexadecimal byte sequence is:
3c 3f 61 64 63
With IBM864, the second character '%' of the string is encoded as 3f, which is '?' in that character set, instead of the expected value 25, which is '%'. The character is therefore encoded incorrectly.
3. The root cause of the bug
Analysis shows that the problem is in the IBM864.java file: the fields b2cTable and b2c of the IBM864 class map byte 0x25 to the Arabic percent sign (٪, U+066A) rather than to the ASCII percent sign (%, U+0025). As a result, the encoder has no mapping for '%', and the encoding is inconsistent with the specification and with expectations. For the IBM864 character set definition, see: https://www.compart.com/en/unicode/charsets/IBM864
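The mapping described above can be inspected directly from the public Charset API, without reading IBM864.java. The following is a minimal sketch, not part of the original report: the class name ProbeIBM864 and the helper decodeByte are hypothetical, and the printed values depend on the JDK build in use, so no expected output is given.

```java
import java.nio.charset.Charset;

class ProbeIBM864 {
    // Hypothetical helper: decode one byte with the given charset and
    // return the code point the decoder (b2c direction) assigns to it.
    static int decodeByte(Charset cs, int b) {
        String s = new String(new byte[] { (byte) b }, cs);
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        Charset cs = Charset.forName("IBM864");

        // Which code point does byte 0x25 decode to?
        System.out.printf("0x25 decodes to U+%04X%n", decodeByte(cs, 0x25));

        // Can the encoder (c2b direction) map each of the two percent signs?
        System.out.println("U+0025 encodable: " + cs.newEncoder().canEncode('%'));
        System.out.println("U+066A encodable: " + cs.newEncoder().canEncode('\u066A'));
    }
}
```

If the analysis above holds for the JDK under test, the first line would report U+066A and the ASCII percent sign would be reported as not encodable.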
REGRESSION : Last worked in version 19
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the test program TestIBM864 given below; it reproduces the problem consistently.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
str = <%adc, encoding = UTF-8
3c 25 61 64 63
str = <%adc, encoding = IBM864
3c 25 61 64 63
ACTUAL -
str = <%adc, encoding = UTF-8
3c 25 61 64 63
str = <%adc, encoding = IBM864
3c 3f 61 64 63
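The 3f in the actual output appears because String.getBytes silently substitutes the charset's replacement bytes for characters it cannot map. To surface the failure as an exception instead, a CharsetEncoder can be configured with CodingErrorAction.REPORT. This is a minimal sketch under that assumption; the class name StrictEncode and the helper encodeStrict are hypothetical and not part of the original test program.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

class StrictEncode {
    // Hypothetical helper: encode text and throw, rather than substitute
    // a replacement byte, when a character has no mapping in the charset.
    static byte[] encodeStrict(String text, Charset cs) throws CharacterCodingException {
        ByteBuffer buf = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(text));
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) {
        Charset cs = Charset.forName("IBM864");
        try {
            byte[] bytes = encodeStrict("<%adc", cs);
            StringBuilder hex = new StringBuilder();
            for (byte b : bytes) {
                hex.append(String.format("%02x ", b));
            }
            System.out.println(hex.toString().trim());
        } catch (CharacterCodingException e) {
            // Reached when some character of the input cannot be encoded.
            System.out.println("unmappable or malformed input: " + e);
        }
    }
}
```

On a JDK that shows the behavior reported here, the catch branch would be taken for '%'; on a JDK where '%' is mapped, the hex bytes are printed instead.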
---------- BEGIN SOURCE ----------
public class TestIBM864 {
    // Print each byte of the array as a lowercase hexadecimal value.
    public static void printBytesArray(byte[] bytesArr) {
        for (byte b : bytesArr) {
            System.out.printf("%x ", b);
        }
        System.out.println();
    }

    // Encode the test string with the named charset and dump the resulting bytes.
    public static void testEncode(String encoding) throws Exception {
        String str = "<%adc";
        System.out.printf("str = %s, encoding = %s \n", str, encoding);
        printBytesArray(str.getBytes(encoding));
    }

    public static void main(String[] args) throws Exception {
        testEncode("UTF-8");
        testEncode("IBM864");
    }
}
---------- END SOURCE ----------
FREQUENCY : always