- Bug
- Resolution: Fixed
- P3
- 1.4.0
- None
- 11
- generic
- generic

Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2046717 | 1.4.0 | Ian Little | P3 | Resolved | Fixed | beta3 |
JDK-2046716 | 1.3.1_03 | Ian Little | P3 | Resolved | Fixed | 03 |
The GB18030 converter (the decoder part, byte->char) currently performs
inadequate bounds checking on illegal or undefined native chars and char
sequences. The following checks need to be improved or added:
1. The native char values 0x80 and 0xFF currently cause a
java.io.MalformedInputException to be thrown. These native chars are not
illegal in GB18030; they are merely undefined. The converter should
perform substitution when it encounters these single bytes while
transcoding GB18030 input.
2. The decoder does not adequately reject illegal 2-byte sequences.
2-byte encodings with a second byte of 0x3A-0x40, 0x7F, or 0xFF are
invalid and should generate a java.io.MalformedInputException,
but they currently do not.
3. For the surrogate character range, the decoder needs to provide substitution
chars where the encoded input sequence corresponds to chars above
plane 16, as these chars are unassigned.
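
The three checks above can be sketched roughly as follows. This is an illustrative sketch only, not the converter's actual code: the class name `Gb18030Checks`, the `Kind` enum, and all method names are hypothetical. It assumes the byte ranges of the GB18030 standard (two-byte trail bytes 0x40-0x7E and 0x80-0xFE, four-byte second bytes 0x30-0x39) and the usual linear mapping of four-byte sequences starting at 0x90308130 onto U+10000 and above. Under those ranges 0x40 is a valid trail byte, so the sketch accepts it even though the list above names 0x3A-0x40 as invalid.

```java
// Illustrative sketch; names are hypothetical and not taken from the JDK sources.
final class Gb18030Checks {

    /** Outcome of inspecting one byte at a given position in a sequence. */
    enum Kind { ASCII, LEAD, TWO_BYTE, FOUR_BYTE, SUBSTITUTE, MALFORMED }

    /** Check 1: 0x80 and 0xFF are undefined single bytes, so substitute rather than throw. */
    static Kind classifyFirst(int b) {
        b &= 0xFF;
        if (b <= 0x7F) return Kind.ASCII;
        if (b == 0x80 || b == 0xFF) return Kind.SUBSTITUTE;
        return Kind.LEAD; // 0x81-0xFE starts a 2-byte or 4-byte sequence
    }

    /** Check 2: classify the byte following a lead byte; anything out of range is malformed. */
    static Kind classifySecond(int b) {
        b &= 0xFF;
        if (b >= 0x30 && b <= 0x39) return Kind.FOUR_BYTE;
        if (b >= 0x40 && b <= 0xFE && b != 0x7F) return Kind.TWO_BYTE;
        return Kind.MALFORMED; // includes 0x3A-0x3F, 0x7F and 0xFF
    }

    /**
     * Check 3: map a range-checked 4-byte sequence with b1 >= 0x90 to a
     * supplementary code point, or return -1 (meaning "substitute") when the
     * result would lie above plane 16.
     */
    static int supplementaryCodePoint(int b1, int b2, int b3, int b4) {
        int index = (((b1 - 0x90) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10
                + (b4 - 0x30);
        int cp = 0x10000 + index;
        return cp <= Character.MAX_CODE_POINT ? cp : -1;
    }

    public static void main(String[] args) {
        System.out.println(classifyFirst(0x80));
        System.out.println(classifySecond(0x7F));
        System.out.println(Integer.toHexString(supplementaryCodePoint(0x90, 0x30, 0x81, 0x30)));
    }
}
```

A real decoder would also have to handle truncated input and validate the third and fourth bytes (0x81-0xFE and 0x30-0x39 respectively) before applying the four-byte mapping; the sketch leaves those steps out to keep the three checks visible.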

- backported by
  - JDK-2046716 GB18030 converter should perform additional bounds checks (Resolved)
  - JDK-2046717 GB18030 converter should perform additional bounds checks (Resolved)