JDK-4812232

REGRESSION: UTF-16 encoding not working for some characters, \ud800 to \ud8ff.

    • Type: Bug
    • Resolution: Not an Issue
    • Priority: P3
    • None
    • 1.4.0
    • core-libs

      Name: nt126004 Date: 02/03/2003


      FULL PRODUCT VERSION :
      java version "1.4.0"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-b92)
      Java HotSpot(TM) Client VM (build 1.4.0-b92, mixed mode)

      FULL OPERATING SYSTEM VERSION :
      Microsoft Windows 2000 [Version 5.00.2195]

      ADDITIONAL OPERATING SYSTEMS :

      We tested on both Linux and Windows platforms.

      A DESCRIPTION OF THE PROBLEM :
      The following code snippet demonstrates the encoding problem.


      String original = "\ud800";
      byte [] b = original.getBytes("UTF-16");

      String converted = new String ( b , "UTF-16");

      if(converted.equals(original))
      {
        // No problem
      }
      else
      {
        // Problem in the UTF encoding ...
      }

      Note: This problem occurs only for Unicode characters in the
      range '\ud800' to '\udfff'. For all other characters there
      is no problem. Also, this problem doesn't occur in the
      JDK 1.3 release.
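
      The range in the note above, U+D800 to U+DFFF, is the block of
      UTF-16 surrogate code units, which are only meaningful as a
      high/low pair. The sketch below is an illustration and not part
      of the original report: it assumes Java 7 or later (for
      StandardCharsets), and the class and method names are
      hypothetical. It contrasts a lone high surrogate with a complete
      pair, relying on CharsetEncoder.canEncode, which is documented to
      return false for an unpaired surrogate.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class SurrogateCheck {
    // True if the string survives an encode/decode round trip through UTF-16.
    // String.getBytes(Charset) substitutes the charset's replacement bytes
    // for malformed input, so an unpaired surrogate does not survive.
    static boolean roundTrips(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_16);
        return new String(bytes, StandardCharsets.UTF_16).equals(s);
    }

    // True if the UTF-16 encoder accepts the whole string without substitution.
    static boolean encodable(String s) {
        CharsetEncoder encoder = StandardCharsets.UTF_16.newEncoder();
        return encoder.canEncode(s);
    }

    public static void main(String[] args) {
        String lone = "\ud800";        // high surrogate with no low partner
        String pair = "\ud800\udc00";  // complete surrogate pair (U+10000)

        System.out.println("lone encodable:   " + encodable(lone));
        System.out.println("lone round trips: " + roundTrips(lone));
        System.out.println("pair encodable:   " + encodable(pair));
        System.out.println("pair round trips: " + roundTrips(pair));
    }
}
```

      If the lone surrogate is reported as not encodable while the pair
      is, the mismatch described here comes from the input string not
      being well-formed UTF-16 rather than from a loss of valid data.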

      REGRESSION. Last worked in version 1.3

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      1. Run the code snippet given above on JDK 1.4.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      The 'original' and 'converted' strings are expected to be equal,
      but the comparison reports that they are not equal.

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
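      public class Test {
          // getBytes(String) and new String(byte[], String) declare the checked
          // UnsupportedEncodingException, so main must declare it to compile.
          public static void main(String[] args) throws java.io.UnsupportedEncodingException {
              String original = "\ud800";
              byte[] b = original.getBytes("UTF-16");

              String converted = new String(b, "UTF-16");

              if (converted.equals(original)) {
                  System.out.println("strings are equal, OK!");
              } else {
                  System.out.println("strings are not equal, BAD!");
              }
          }
      }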

      ---------- END SOURCE ----------

      CUSTOMER WORKAROUND :
      No idea
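
      One possible approach, offered as a sketch under assumptions
      rather than as a confirmed workaround for this report: check the
      string for unpaired surrogates before encoding it, so the caller
      can reject or repair the input up front. The class and method
      names below are hypothetical; Character.isHighSurrogate and
      Character.isLowSurrogate require Java 5 or later, so this does
      not apply as-is to JDK 1.4.

```java
// Hypothetical helper class, for illustration only.
class UnpairedSurrogates {
    // Returns the index of the first unpaired surrogate code unit,
    // or -1 if every surrogate in the string is part of a valid pair.
    static int firstUnpaired(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                boolean hasPartner = i + 1 < s.length()
                        && Character.isLowSurrogate(s.charAt(i + 1));
                if (!hasPartner) {
                    return i;   // high surrogate without a following low one
                }
                i++;            // skip the low half of the valid pair
            } else if (Character.isLowSurrogate(c)) {
                return i;       // low surrogate without a preceding high one
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstUnpaired("\ud800"));
        System.out.println(firstUnpaired("a\ud800\udc00b"));
    }
}
```

      A caller could invoke firstUnpaired before getBytes("UTF-16") and
      treat any result other than -1 as invalid input.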

      Release Regression From : 1.3.1
      The release above is the last one in which this code was known
      to work. The behavior has regressed since then.

      (Review ID: 180554)
      ======================================================================

            ilittlesunw Ian Little (Inactive)
            nthompsosunw Nathanael Thompson (Inactive)