JDK-4802209

String and OutputStreamWriter classes sometimes encode UTF-8 incorrectly


    • b28
    • generic
    • solaris_8
    • Verified



      Name: dfR10049 Date: 01/13/2003


      The JCK tests for java.net/URL[Encoder/Decoder] fail if they are run
      after the JCK tests for the java_io package in the same JVM, on
      Solaris 2.8 with LC_CTYPE set to "en_US.UTF-8".

      The bug is: the URLEncoder.encode method with the "UTF-8" encoding
      processes surrogate pairs incorrectly if an instance of InputStreamReader
      has been created and new URL(http_url).openConnection().connect() has
      been called beforehand.

      In other words, creating an InputStreamReader instance and connecting
      to an HTTP URL affects the output of the following call:
         URLEncoder.encode("\uD800\uDC00 \uD801\uDC01 ", "UTF-8")
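For reference, the expected result of that call can be derived by hand: "\uD800\uDC00" is the code point U+10000 and "\uD801\uDC01" is U+10401, each of which takes four bytes in UTF-8. The sketch below computes the percent-encoded form without going through any charset encoder, so it does not depend on the code path under suspicion. The class and method names are hypothetical, it assumes well-formed input (no unpaired surrogates), and it uses APIs from a current JDK (String.codePointAt, StringBuilder) that are not available in 1.4.2.

```java
import java.io.ByteArrayOutputStream;

public class ExpectedUtf8 {

    // Write the UTF-8 bytes of one code point, computed by bit arithmetic.
    static void appendUtf8(ByteArrayOutputStream out, int cp) {
        if (cp < 0x80) {
            out.write(cp);
        } else if (cp < 0x800) {
            out.write(0xC0 | (cp >> 6));
            out.write(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out.write(0xE0 | (cp >> 12));
            out.write(0x80 | ((cp >> 6) & 0x3F));
            out.write(0x80 | (cp & 0x3F));
        } else {
            // Supplementary code points (surrogate pairs in a String): 4 bytes.
            out.write(0xF0 | (cp >> 18));
            out.write(0x80 | ((cp >> 12) & 0x3F));
            out.write(0x80 | ((cp >> 6) & 0x3F));
            out.write(0x80 | (cp & 0x3F));
        }
    }

    // Characters that URLEncoder leaves unchanged.
    static boolean isUnreserved(int cp) {
        return (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z')
                || (cp >= '0' && cp <= '9')
                || cp == '.' || cp == '-' || cp == '*' || cp == '_';
    }

    // Hand-rolled equivalent of URLEncoder.encode(s, "UTF-8") for well-formed input.
    static String expectedUrlEncoding(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);          // joins a surrogate pair into one code point
            i += Character.charCount(cp);
            if (cp == ' ') {
                sb.append('+');
            } else if (isUnreserved(cp)) {
                sb.append((char) cp);
            } else {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                appendUtf8(out, cp);
                for (byte b : out.toByteArray()) {
                    sb.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(expectedUrlEncoding("\uD800\uDC00 \uD801\uDC01 "));
    }
}
```

For the string in the report this yields %F0%90%80%80+%F0%90%90%81+, which matches the enc1 value produced before the InputStreamReader and the connection are created.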


      The following minimal test demonstrates the bug:
      ----------------- EncTest.java ------------------
      import java.io.*;
      import java.net.*;

      public class EncTest {

          public static void main(String args[]) {
              try {
                  String toEncode = "\uD800\uDC00 \uD801\uDC01 ";
                  String enc1 = URLEncoder.encode(toEncode, "UTF-8");

                  byte bytes[] = {};
                  ByteArrayInputStream bais = new ByteArrayInputStream( bytes );
                  InputStreamReader reader = new InputStreamReader( bais, "8859_1" );

                  new URL(args[0]).openConnection().connect();

                  String enc2 = URLEncoder.encode(toEncode, "UTF-8");
                  if (enc1.equals(enc2)) {
                      System.out.println("Test passed: ");
                  } else {
                      System.out.println("Test failed: ");
                  }
                  System.out.println(" enc1: " + enc1);
                  System.out.println(" enc2: " + enc2);

              } catch (Exception e) {
                  System.out.println(e);
              }

          }

      }
      -----------------------------------------
      #> uname -a
      SunOS matmech 5.8 Generic_108528-14 sun4u sparc SUNW,Ultra-5_10

      #> echo $LC_CTYPE
      en_US.UTF-8

      #> java -version
      java version "1.4.2-beta"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-beta-b12)
      Java HotSpot(TM) Client VM (build 1.4.2-beta-b12, mixed mode)

      #> java EncTest <SOME AVAILABLE HTTP URL>
      Test failed:
          enc1: %F0%90%80%80+%F0%90%90%81+
          enc2: ++

      Note: the bug is reproducible with jdk1.4.2 b11 and b12.
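The synopsis names String and OutputStreamWriter, while the test above only goes through URLEncoder. A minimal sketch of a direct check on those two classes follows. It is an assumption that these paths show the same symptom; the report does not include output for them. The class and method names are hypothetical, and the code uses StandardCharsets and try-with-resources from a current JDK rather than the 1.4.2 API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8PathCheck {

    // UTF-8 encoding of U+10000, the first surrogate pair used in the report.
    static final byte[] EXPECTED = {
        (byte) 0xF0, (byte) 0x90, (byte) 0x80, (byte) 0x80
    };

    // Encode through String.getBytes.
    static byte[] viaString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Encode through OutputStreamWriter; closing the writer flushes the encoder.
    static byte[] viaWriter(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (OutputStreamWriter w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String pair = "\uD800\uDC00";
        System.out.println("String.getBytes matches:    "
                + Arrays.equals(EXPECTED, viaString(pair)));
        System.out.println("OutputStreamWriter matches: "
                + Arrays.equals(EXPECTED, viaWriter(pair)));
    }
}
```

To mirror the failing scenario, the two checks would be run once before and once after the InputStreamReader is created and the HTTP connection is made, as in EncTest.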

      ======================================================================

            busersunw Btplusnull User (Inactive)
            fdasunw Fda Fda (Inactive)