JDK-8151244

URI Constructor Doesn't Encode Path Correctly

    • generic
    • generic

      FULL PRODUCT VERSION :


      A DESCRIPTION OF THE PROBLEM :
      According to the reference docs for this URI constructor: https://docs.oracle.com/javase/8/docs/api/java/net/URI.html#URI-java.lang.String-java.lang.String-java.lang.String-java.lang.String-java.lang.String-

      "If a path is given then it is appended. Any character not in the unreserved, punct, escaped, or other categories, and not equal to the slash character ('/') or the commercial-at character ('@'), is quoted."

      An escaped character is defined further up on the page as:
      "Escaped octets, that is, triplets consisting of the percent character ('%') followed by two hexadecimal digits ('0'-'9', 'A'-'F', and 'a'-'f')"
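To make the definition concrete, here is a minimal sketch of how the triplets for a character can be produced by hand from its UTF-8 bytes. The class and method names are hypothetical and not part of the JDK, and the helper escapes every byte (including ASCII), which is only meant to illustrate the %HH form. The sequence %EF%A5%81 used in this report corresponds to the code point U+F941, so the example writes the character as a Unicode escape.

```java
import java.nio.charset.StandardCharsets;

class EscapedOctetSketch {
    // Hypothetical helper: turn every UTF-8 byte of the input into a %HH triplet.
    static String toEscapedOctets(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append('%').append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+F941 is written as an escape so the source file encoding does not matter.
        System.out.println(toEscapedOctets("\uF941")); // prints %EF%A5%81
    }
}
```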

      論 is in the other category, so it doesn't surprise me that it isn't encoded when I use it in the path directly. However, if I encode it as %EF%A5%81, it should be considered an 'escaped' character and, according to the docs, should not be quoted. Per the reproduction steps below, it is quoted anyway.

      As it stands, I see no way to use URI with these sorts of characters in the path. If I don't encode them, they don't get quoted and fail over the wire. If I do, they get double-encoded, which results in an incorrect value. Furthermore, the only way I can see to pre-encode them is URLEncoder, which is specifically documented as not being intended for this usage.
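One possible way around the double encoding, offered as a sketch rather than a confirmed resolution of this issue: the same URI class documentation states that the multi-argument constructors always quote the percent character, whereas the single-argument constructor (also reachable through URI.create) parses the string as given and leaves existing %HH triplets alone. The class and method names below are hypothetical; the pre-encoded string is assumed to have been produced elsewhere.

```java
import java.net.URI;

class PreEncodedPathSketch {
    // Parse an already percent-encoded path with the single-argument form
    // and return the raw (still encoded) path.
    static String rawPathOf(String preEncodedPath) {
        return URI.create(preEncodedPath).getRawPath();
    }

    // Same parse, but return the decoded path.
    static String decodedPathOf(String preEncodedPath) {
        return URI.create(preEncodedPath).getPath();
    }

    public static void main(String[] args) {
        String preEncoded = "%EF%A5%81"; // UTF-8 bytes of U+F941 as escaped octets
        System.out.println(rawPathOf(preEncoded));                      // prints %EF%A5%81
        System.out.println(decodedPathOf(preEncoded).equals("\uF941")); // prints true
    }
}
```

URI.create throws an unchecked IllegalArgumentException on malformed input, so code that handles untrusted strings may prefer new URI(String) and its checked URISyntaxException.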

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
              System.out.println(new URI(null, null, "論", null, null).getPath());
              System.out.println(new URI(null, null, "論", null, null).getRawPath());
              System.out.println(new URI(null, null, URLEncoder.encode("論", Constants.UTF8_CHARSET), null, null).getPath());
              System.out.println(new URI(null, null, URLEncoder.encode("論", Constants.UTF8_CHARSET), null, null).getRawPath());
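The four statements above are not compilable on their own, so here is a self-contained version for convenience. It assumes that Constants.UTF8_CHARSET in the reporter's code is the charset name "UTF-8", and it writes the character as the escape \uF941 (the code point whose UTF-8 bytes are EF A5 81). The class name and the two helper methods are hypothetical wrappers that only hide the checked exceptions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

class UriPathRepro {
    // Build a URI from a path alone, using the five-argument constructor from the report.
    static URI fromPath(String path) {
        try {
            return new URI(null, null, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Assumption: Constants.UTF8_CHARSET in the report is the string "UTF-8".
    static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String ch = "\uF941";
        System.out.println(fromPath(ch).getPath());
        System.out.println(fromPath(ch).getRawPath());
        System.out.println(fromPath(urlEncode(ch)).getPath());
        System.out.println(fromPath(urlEncode(ch)).getRawPath());
    }
}
```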

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      論
      論
      論
      %EF%A5%81
      ACTUAL -
      論
      論
      %EF%A5%81
      %25EF%25A5%2581

      REPRODUCIBILITY :
      This bug can be reproduced always.

            naoto Naoto Sato
            webbuggrp Webbug Group