JDK-4496644

GB18030 converter should perform additional bounds checks

    • Type: Bug
    • Resolution: Fixed
    • Priority: P3
    • 1.2.2_11
    • 1.4.0
    • core-libs
    • None

        The GB18030 converter (the decoder part, byte->char) performs inadequate
        bounds checking on illegal or undefined native chars and char sequences.
        The following checks need to be improved or added:

        1. The native char values 0x80 and 0xFF currently cause a
           java.io.MalformedInputException to be thrown. These native chars are
           not illegal in GB18030; they are merely undefined. The converter should
           perform substitution when it encounters either of these single bytes
           while transcoding GB18030 input.
        2. The decoder does not adequately reject illegal 2-byte sequences.
           2-byte encodings with a second byte of 0x3A-0x3F, 0x7F, or 0xFF are
           invalid (valid 2-byte second bytes start at 0x40) and should generate
           a java.io.MalformedInputException, but they currently do not.
        3. For the surrogate character range, the decoder needs to provide
           substitution chars where the encoded input sequence corresponds to
           chars above plane 16, as these chars are unassigned.
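The three checks above can be sketched as standalone predicates. This is only an illustration and not the converter's actual code: the class and method names are hypothetical, and the second-byte check covers only the values named in this report. The upper end of the illegal second-byte range is taken as 0x3F, on the assumption that 0x40 is the first valid second byte of a 2-byte GB18030 sequence. The 4-byte arithmetic assumes the standard GB18030 layout, in which 0x90308130 maps to U+10000 and each later sequence maps to the next code point.

```java
// Hypothetical helper names; a sketch of the checks, not the JDK converter.
final class Gb18030BoundsSketch {

    // Check 1: a lone 0x80 or 0xFF is undefined rather than illegal,
    // so a decoder would substitute instead of reporting malformed input.
    static boolean isUndefinedSingleByte(int b) {
        return b == 0x80 || b == 0xFF;
    }

    // Check 2: second bytes that make a 2-byte sequence malformed.
    // Only the values discussed in the report are covered here.
    static boolean isIllegalSecondByte(int b2) {
        return (b2 >= 0x3A && b2 <= 0x3F) || b2 == 0x7F || b2 == 0xFF;
    }

    // Check 3: map a 4-byte sequence in the supplementary range to a code point.
    // Assumes b1 is 0x90-0xFE, b2 and b4 are 0x30-0x39, and b3 is 0x81-0xFE.
    static int supplementaryCodePoint(int b1, int b2, int b3, int b4) {
        int linear = (((b1 - 0x90) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10
                + (b4 - 0x30);
        return 0x10000 + linear;
    }

    // True when the sequence lands beyond U+10FFFF, the last code point of
    // plane 16, in which case the decoder should emit a substitution char.
    static boolean isAbovePlane16(int b1, int b2, int b3, int b4) {
        return supplementaryCodePoint(b1, b2, b3, b4) > Character.MAX_CODE_POINT;
    }

    public static void main(String[] args) {
        System.out.println("0x80 undefined single byte: " + isUndefinedSingleByte(0x80));
        System.out.println("0x7F illegal second byte: " + isIllegalSecondByte(0x7F));
        System.out.println("E3 32 9A 36 above plane 16: "
                + isAbovePlane16(0xE3, 0x32, 0x9A, 0x36));
    }
}
```

Keeping the checks as separate predicates reflects the distinction the report draws: undefined input (checks 1 and 3) leads to substitution, while illegal input (check 2) leads to an exception.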

              ilittlesunw Ian Little (Inactive)