Type: Bug
Resolution: Unresolved
Priority: P4
Fix Version/s: tbd
Affects Version/s: 8, 9, 10, 11
Component/s: core-libs
Labels:
Subcomponent: java.net
CPU: generic
OS: generic

ADDITIONAL SYSTEM INFORMATION :
All systems; the behavior is platform-independent.

A DESCRIPTION OF THE PROBLEM :
RFC 3986 defines the following character classes:
    unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    reserved = gen-delims / sub-delims
    gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

"For consistency, percent-encoded octets in the ranges of ALPHA (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E) should not be created by URI producers and, when found in a URI, should be decoded to their corresponding unreserved characters by URI normalizers."
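The normalization rule quoted above can be illustrated with a minimal sketch. The class name UnreservedNormalizer and its methods are hypothetical, not part of the JDK; the sketch only decodes percent-encoded octets that fall in the RFC 3986 unreserved set and leaves every other triplet untouched.

```java
// Hypothetical helper, not a JDK class: decodes only percent-encoded
// octets that map to RFC 3986 unreserved characters.
final class UnreservedNormalizer {
    private UnreservedNormalizer() {}

    // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }

    // Returns the value of an ASCII hex digit, or -1 if ch is not one.
    static int hexValue(char ch) {
        if (ch >= '0' && ch <= '9') return ch - '0';
        if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
        if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
        return -1;
    }

    static String normalize(String uri) {
        StringBuilder out = new StringBuilder(uri.length());
        int i = 0;
        while (i < uri.length()) {
            char ch = uri.charAt(i);
            if (ch == '%' && i + 2 < uri.length()) {
                int hi = hexValue(uri.charAt(i + 1));
                int lo = hexValue(uri.charAt(i + 2));
                if (hi >= 0 && lo >= 0) {
                    int octet = hi * 16 + lo;
                    if (isUnreserved(octet)) {
                        out.append((char) octet);      // e.g. %7E becomes ~
                    } else {
                        out.append(uri, i, i + 3);     // keep reserved/other octets encoded
                    }
                    i += 3;
                    continue;
                }
            }
            out.append(ch); // ordinary character or malformed escape: copy as-is
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("%7Euser/%41%2a"));
    }
}
```

Under these assumptions, %7E and %41 are decoded because ~ and A are unreserved, while %2a stays encoded because * is a sub-delim.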

Per the URLEncoder documentation and observed behavior: The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same. The special characters ".", "-", "*", and "_" remain the same.

Per RFC 3986, ~ is an unreserved character, whereas * is a reserved character (a sub-delim).
Thus URLEncoder encodes ~ when it should not, and leaves * unencoded when it should encode it.
Under the current RFC, URLEncoder has no basis for treating * as a "special character" that is left unencoded.
It has a stronger basis for treating ~ as a "special character" that is left unencoded.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
System.out.println( URLEncoder.encode( "~", "UTF-8" ) ) ;
System.out.println( URLEncoder.encode( "*", "UTF-8" ) ) ;

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
System.out.println( URLEncoder.encode( "~", "UTF-8" ) ) ; // expected result: ~
System.out.println( URLEncoder.encode( "*", "UTF-8" ) ) ; // expected result: %2A
ACTUAL -
System.out.println( URLEncoder.encode( "~", "UTF-8" ) ) ; // actual result: %7E
System.out.println( URLEncoder.encode( "*", "UTF-8" ) ) ; // actual result: *

---------- BEGIN SOURCE ----------
System.out.println( URLEncoder.encode( "~", "UTF-8" ) ) ;
System.out.println( URLEncoder.encode( "*", "UTF-8" ) ) ;
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Use a different URL encoder.
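
Where switching encoders is not practical, one possible alternative is to post-process the output of URLEncoder. The following is a minimal sketch, not a vetted fix: the class name Rfc3986Encoder is hypothetical, and it adjusts only the two characters discussed in this report (all other URLEncoder behavior, such as encoding a space as +, is left as-is).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical wrapper, not a JDK class: adjusts URLEncoder output
// for the two characters discussed in this report.
final class Rfc3986Encoder {
    private Rfc3986Encoder() {}

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8")
                    // "*" is a sub-delim in RFC 3986; URLEncoder leaves it literal
                    .replace("*", "%2A")
                    // "~" is unreserved in RFC 3986; URLEncoder emits %7E for it
                    .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform is required to support
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("~"));
        System.out.println(encode("*"));
    }
}
```

The replacements are safe on URLEncoder output because a literal % in the input is itself encoded as %25, so the sequence %7E in the output can only come from an input ~, and a literal * in the output can only come from an input *.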

FREQUENCY : always

Assignee: Chris Hegarty

Reporter: Webbug Group

Votes: 0

Watchers: 2

Created: 2018-06-05 07:58

Updated: 2018-11-15 05:03
