JDK-8152841

sun.nio.cs.UnicodeDecoder incorrectly rejects U+FFFE


    • generic
    • generic

      FULL PRODUCT VERSION :


      A DESCRIPTION OF THE PROBLEM :
      sun.nio.cs.UnicodeDecoder incorrectly rejects U+FFFE.

      The check at http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/sun/nio/cs/UnicodeDecoder.java#94 should be removed because, contrary to the comment on line 95, a reversed BOM *can* occur in the middle of a stream. The BOM and the reversed BOM are special only at the start of a stream, where they distinguish UTF-16BE from UTF-16LE.
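
      To illustrate the point about the start of the stream, here is a minimal sketch using only the public java.nio.charset API. The class name BomDemo and the helper decodeUtf16 are hypothetical names chosen for this example. It decodes the letter "A" once with a big-endian BOM and once with a little-endian BOM, then decodes a stream in which U+FEFF appears again after the first character.

```java
import java.nio.charset.StandardCharsets;

class BomDemo {
    // Decodes bytes with the BOM-detecting "UTF-16" charset.
    static String decodeUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // FE FF at the start selects big-endian; the BOM itself is consumed.
        byte[] bigEndian = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41};
        // FF FE at the start selects little-endian; the BOM itself is consumed.
        byte[] littleEndian = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        // A second FE FF after the first character is ordinary data (U+FEFF).
        byte[] bomInMiddle = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41, (byte) 0xFE, (byte) 0xFF};

        System.out.println(decodeUtf16(bigEndian).equals("A"));
        System.out.println(decodeUtf16(littleEndian).equals("A"));
        System.out.println(decodeUtf16(bomInMiddle).equals("A\uFEFF"));
    }
}
```

      Only the first two bytes influence the byte order. The report argues that U+FFFE in a later position should be passed through in the same way that U+FEFF is.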

      From the unicode.org FAQ (http://www.unicode.org/faq/private_use.html#sentinel6):

      Q: I read somewhere that U+FFFE and U+FFFF were illegal in Unicode, and could be used as sentinels. Is that true?
      A: Well, the short answer is no, that is not true—at least, not entirely true. U+FFFE and U+FFFF are noncharacters just like the other 64 noncharacters in the standard, and are valid in Unicode strings.

      "Unicode 2.0 dropped the explicit prohibition against transmission or storage of U+FFFE and U+FFFF"

      Unicode 3.0: "To ensure that round-trip transcoding is possible, a UTF mapping must also map invalid Unicode scalar values to unique code value sequences. These invalid scalar values include U+FFFE, U+FFFF, and unpaired surrogates."

      Unicode 4.0: "To ensure that the mapping for a Unicode encoding form is one-to-one, all Unicode scalar values, including those corresponding to noncharacter code points and unassigned code points, must be mapped to unique code unit sequences."

      Mapping multiple code points to '\uFFFD', as sun.nio.cs.UnicodeDecoder currently does, means the mapping is not one-to-one.



      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
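      String a = "\uFFFE";
      // A round trip should preserve the string; compare contents with equals(), not ==.
      System.out.println(new String(a.getBytes("UTF-16"), "UTF-16").equals(a));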
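
      The same round trip can be written as a self-contained program. This is a sketch, not part of the original report: the class name RoundTripCheck and the helper roundTrip are hypothetical. It configures the decoder with CodingErrorAction.REPORT so that a rejected U+FFFE surfaces as an exception instead of being silently replaced by U+FFFD. The result for U+FFFE depends on the JDK build in use, so no expected output is given for it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class RoundTripCheck {
    // Encodes s as UTF-16, then decodes strictly.
    // Returns the decoded string, or null if the decoder rejects the bytes.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_16);
        try {
            return StandardCharsets.UTF_16.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String a = "\uFFFE";
        String decoded = roundTrip(a);
        if (decoded == null) {
            System.out.println("decoder rejected U+FFFE as malformed input");
        } else {
            System.out.println("round trip preserved the string: " + decoded.equals(a));
        }
    }
}
```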


      REPRODUCIBILITY :
      This bug can be reproduced always.

            psonal Pallavi Sonal (Inactive)
            webbuggrp Webbug Group