Type: Bug
Resolution: Fixed
Priority: P4
Fix Version/s: 20
Affects Version/s: 8, 11, 17, 18, 19
Component/s: core-libs
Labels:

Subcomponent: java.nio.charsets
Resolved In Build: b09
CPU: generic
OS: generic

ADDITIONAL SYSTEM INFORMATION :
No restrictions; the problem occurs on all hardware and operating systems.

A DESCRIPTION OF THE PROBLEM :
1. File where the bug is located
class file: sun.nio.cs.ext.IBM864.class
jar package file: charsets.jar

2. Main symptom of the bug
Test string: "<%adc"
(1) Encoded with the UTF-8 character set, its hexadecimal byte sequence is:
3c 25 61 64 63
(2) Encoded with the IBM864 character set, its hexadecimal byte sequence is:
3c 3f 61 64 63
With IBM864, the second character '%' of the string is encoded as 3f, which is '?' in that character set, instead of the expected value 25, which is '%'. The encoded output is therefore wrong.
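The 3f byte is the encoder's replacement for a character it cannot map, so the substitution happens silently. As a minimal sketch (not part of the original report), an encoder configured with CodingErrorAction.REPORT can surface the unmappable character as an exception instead. The class name StrictEncodeSketch and the helper encodeStrict are hypothetical names chosen for illustration. The output for IBM864 depends on the JDK version, so no expected output is shown for it.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictEncodeSketch {

    // Hypothetical helper: encodes str without silent replacement.
    // Returns the bytes as space-separated hex, or null if any
    // character cannot be encoded in the given charset.
    static String encodeStrict(String str, Charset cs) {
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer buf = enc.encode(CharBuffer.wrap(str));
            StringBuilder sb = new StringBuilder();
            while (buf.hasRemaining()) {
                sb.append(String.format("%02x ", buf.get()));
            }
            return sb.toString().trim();
        } catch (CharacterCodingException e) {
            // Malformed input or an unmappable character was reported.
            return null;
        }
    }

    public static void main(String[] args) {
        String str = "<%adc";
        System.out.println("UTF-8:  " + encodeStrict(str, StandardCharsets.UTF_8));
        // IBM864 is an extended charset and may be absent from a reduced runtime.
        if (Charset.isSupported("IBM864")) {
            System.out.println("IBM864: " + encodeStrict(str, Charset.forName("IBM864")));
        } else {
            System.out.println("IBM864 is not available in this runtime");
        }
    }
}
```

If the IBM864 line prints null, the runtime has no encoding for one of the characters in the test string, which is the condition that String.getBytes hides behind the 3f byte.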

3. Root cause of the bug
Analysis shows that the problem with '%' comes from the IBM864.java file: the fields b2cTable and b2c of the IBM864 class define the percent character as '٪' (ARABIC PERCENT SIGN, U+066A) rather than '%' (PERCENT SIGN, U+0025). As a result, '%' has no encoding in the charset, and the output is inconsistent with the specification and with expectations. For the character set definition of IBM864, see: https://www.compart.com/en/unicode/charsets/IBM864
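The mapping described above can be checked directly. This is a rough sketch rather than the analysis the reporter performed: it decodes the single byte 0x25 and asks the encoder whether it can handle each of the two percent characters. The class name InspectPercentMapping and its helpers describe and decodeByte are hypothetical. Results differ between affected and fixed JDK versions, so the code prints what it finds and states no expected output.

```java
import java.nio.charset.Charset;

class InspectPercentMapping {

    // Formats a code point in the usual U+XXXX notation.
    static String describe(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    // Decodes one byte with the given charset and returns the
    // first code point of the resulting string.
    static int decodeByte(Charset cs, int b) {
        return new String(new byte[] {(byte) b}, cs).codePointAt(0);
    }

    public static void main(String[] args) {
        // IBM864 is an extended charset and may be absent from a reduced runtime.
        if (!Charset.isSupported("IBM864")) {
            System.out.println("IBM864 is not available in this runtime");
            return;
        }
        Charset cs = Charset.forName("IBM864");
        System.out.println("byte 0x25 decodes to "
                + describe(decodeByte(cs, 0x25)));
        System.out.println("can encode U+0025 (PERCENT SIGN): "
                + cs.newEncoder().canEncode('%'));
        System.out.println("can encode U+066A (ARABIC PERCENT SIGN): "
                + cs.newEncoder().canEncode('\u066A'));
    }
}
```

On an affected version, the report implies that byte 0x25 decodes to the Arabic percent sign and that U+0025 cannot be encoded. Treat that as a reading of the description above, not a verified result.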

REGRESSION : Last worked in version 19

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the test program TestIBM864 given below. It reproduces the problem consistently.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
str = <%adc, encoding = UTF-8
3c 25 61 64 63
str = <%adc, encoding = IBM864
3c 25 61 64 63
ACTUAL -
str = <%adc, encoding = UTF-8
3c 25 61 64 63
str = <%adc, encoding = IBM864
3c 3f 61 64 63

---------- BEGIN SOURCE ----------
public class TestIBM864 {

    // Prints each byte as a two-digit hexadecimal value, separated by spaces.
    public static void printBytesArray(byte[] bytesArr) {
        for (byte b : bytesArr) {
            System.out.printf("%02x ", b);
        }
        System.out.println();
    }

    // Encodes the test string with the named charset and prints the resulting bytes.
    public static void testEncode(String encoding) throws Exception {
        String str = "<%adc";
        System.out.printf("str = %s, encoding = %s%n", str, encoding);
        printBytesArray(str.getBytes(encoding));
    }

    public static void main(String[] args) throws Exception {
        testEncode("UTF-8");   // reference encoding
        testEncode("IBM864");  // charset under test
    }
}
---------- END SOURCE ----------

FREQUENCY : always

TestIBM864.java
2022-07-18 18:43
0.6 kB
Andrew Wang

links to

Commit openjdk/jdk/e52a340d

Review openjdk/jdk/9661

Assignee: Naoto Sato

Reporter: Webbug Group

Votes: 0

Watchers: 6

Created: 2022-07-13 00:07

Updated: 2022-08-08 11:11

Resolved: 2022-08-03 09:10
