JDK-8079633

sun.net.www.ParseUtil emits encoded strings it is not itself able to decode


      FULL PRODUCT VERSION :
      java version "1.8.0_40"
      Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
      Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)


      ADDITIONAL OS VERSION INFORMATION :
      Linux tharbad 3.18.9-200.fc21.x86_64 #1 SMP Mon Mar 9 15:10:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux


      A DESCRIPTION OF THE PROBLEM :
      When encoding non-BMP characters, ParseUtil appears to treat the input as UCS-2 rather than UTF-16: each half of a surrogate pair is encoded as its own three-byte sequence, so the resulting UTF-8 is invalid. Java 7 was apparently able to consume this output and reconstruct the correct UTF-16 sequence when decoding, but in Java 8 decoding fails with the following exception:

      Welcome to JavaREPL version dev.build (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_40)
      Type expression to evaluate, :help for more options or press tab to auto-complete.
      java> import sun.net.www.ParseUtil;
      Imported sun.net.www.ParseUtil
      java> ParseUtil.decode(ParseUtil.encodePath(new String(Character.toChars(0x1F631))));
      java.lang.IllegalArgumentException: Error decoding percent encoded characters
      java> ParseUtil.encodePath(new String(Character.toChars(0x1F631)));
      java.lang.String res0 = "%ed%a0%bd%ed%b8%b1"

      The correct UTF-8 sequence for U+1F631 is 0xF0 0x9F 0x98 0xB1, which percent-encodes to "%f0%9f%98%b1".
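
The difference between the two byte sequences can be reproduced without ParseUtil. The following is a minimal sketch, not the ParseUtil source: the class and method names are hypothetical, and `encodeByCodeUnit` is only an assumed reconstruction of the faulty behaviour, inferred from the output shown in this report. It applies the three-byte UTF-8 pattern to each UTF-16 code unit, whereas `encodeByCodePoint` lets the standard UTF-8 encoder handle the surrogate pair as one code point.

```java
import java.nio.charset.StandardCharsets;

public class SurrogateEncodingSketch {
    // Percent-encode every byte in lower-case hex, matching the format in this report.
    static String percentEncode(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append('%').append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Correct approach: the UTF-8 encoder combines a surrogate pair into one code point.
    static String encodeByCodePoint(String s) {
        return percentEncode(s.getBytes(StandardCharsets.UTF_8));
    }

    // Assumed reconstruction of the bug: each UTF-16 code unit is encoded on its own
    // with the three-byte pattern. Only meaningful for code units >= U+0800,
    // which includes all surrogates.
    static String encodeByCodeUnit(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            byte[] three = {
                (byte) (0xE0 | (c >> 12)),
                (byte) (0x80 | ((c >> 6) & 0x3F)),
                (byte) (0x80 | (c & 0x3F))
            };
            sb.append(percentEncode(three));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String emoji = new String(Character.toChars(0x1F631));
        System.out.println(encodeByCodePoint(emoji)); // %f0%9f%98%b1
        System.out.println(encodeByCodeUnit(emoji));  // %ed%a0%bd%ed%b8%b1
    }
}
```

The second output matches the string ParseUtil.encodePath returns above, which is consistent with the per-code-unit explanation but does not prove that ParseUtil is implemented this way.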

      ParseUtil.encodePath is used by sun.misc.URLClassPath, in turn used by java.net.URLClassLoader.


      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      ParseUtil.encodePath(new String(Character.toChars(0x1F631)));

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      "%f0%9f%98%b1"
      ACTUAL -
      "%ed%a0%bd%ed%b8%b1"


      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      import sun.net.www.ParseUtil;

      public class Bug {
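          public static void main(String[] args) throws Exception {
              // U+1F631 lies outside the BMP, so it is a surrogate pair in UTF-16.
              final String emoji = new String(Character.toChars(0x1F631));
              final String encoded = ParseUtil.encodePath(emoji);
              System.out.println(encoded);
              // On 1.8.0_40 this call throws IllegalArgumentException.
              final String decoded = ParseUtil.decode(encoded);
              System.out.println(emoji.equals(decoded));
          }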
      }

      ---------- END SOURCE ----------
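
The reproducer above depends on the internal sun.net.www.ParseUtil class. As a complement, here is a small sketch that uses only public APIs to check whether a percent-encoded string decodes to well-formed UTF-8. The class `StrictUtf8Check` and its helpers are hypothetical names introduced for illustration, and `percentDecode` assumes every byte of the input is percent-encoded, as in the strings in this report.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictUtf8Check {
    // Convert "%ed%a0%bd"-style text into raw bytes.
    // Assumes the input consists solely of %xx triplets.
    static byte[] percentDecode(String s) {
        byte[] out = new byte[s.length() / 3];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(3 * i + 1, 3 * i + 3), 16);
        }
        return out;
    }

    // Decode strictly: malformed input is reported instead of being replaced.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidUtf8(percentDecode("%f0%9f%98%b1")));       // expected value
        System.out.println(isValidUtf8(percentDecode("%ed%a0%bd%ed%b8%b1"))); // actual value
    }
}
```

A strict decoder accepts the expected string and rejects the actual one, because UTF-8 does not permit encoded surrogate code points.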

            rpatil Ramanand Patil (Inactive)
            webbuggrp Webbug Group