ADDITIONAL SYSTEM INFORMATION :
All systems across the entire universe and beyond.
A DESCRIPTION OF THE PROBLEM :
RFC 3986 (sections 2.2 and 2.3) defines the following character classes:
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
Section 2.3 adds: "For consistency, percent-encoded octets in the ranges of ALPHA (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E) should not be created by URI producers and, when found in a URI, should be decoded to their corresponding unreserved characters by URI normalizers."
Per the URLEncoder documentation and its observed behavior: "The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same. The special characters ".", "-", "*", and "_" remain the same."
Per RFC 3986, "~" is an unreserved character, whereas "*" is reserved (it is a sub-delim).
Thus URLEncoder encodes "~" when it should not, and leaves "*" unencoded when it should encode it.
Under the current RFC, URLEncoder has no basis for treating "*" as a "special character" that is left unencoded.
It has a stronger basis for treating "~" as one.
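To illustrate the rule described above, here is a minimal sketch of an encoder that leaves only the RFC 3986 unreserved set untouched and percent-encodes every other octet of the UTF-8 form. The class name Rfc3986Encoder and its encode method are hypothetical and are not part of the JDK; this is one possible reading of the RFC, not a proposed patch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a JDK class: percent-encodes everything
// except the RFC 3986 "unreserved" characters.
final class Rfc3986Encoder {
    // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    private static final String UNRESERVED =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        // Work on UTF-8 octets so non-ASCII input is encoded byte by byte.
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v < 0x80 && UNRESERVED.indexOf((char) v) >= 0) {
                out.append((char) v);          // unreserved: copy as-is
            } else {
                out.append('%')                // everything else: %HH
                   .append(HEX[v >> 4])
                   .append(HEX[v & 0x0F]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("~"));   // ~
        System.out.println(encode("*"));   // %2A
    }
}
```

Note that this sketch also encodes a space as %20 rather than "+", since "+" is itself a reserved character in RFC 3986.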
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
System.out.println( URLEncoder.encode( "~", "UTF-8" ) ) ;
System.out.println( URLEncoder.encode( "*", "UTF-8" ) ) ;
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
System.out.println( URLEncoder.encode( "~", "UTF-8" ) ) ; // expected result: ~
System.out.println( URLEncoder.encode( "*", "UTF-8" ) ) ; // expected result: %2A
ACTUAL -
System.out.println( URLEncoder.encode( "~", "UTF-8" ) ) ; // actual result: %7E
System.out.println( URLEncoder.encode( "*", "UTF-8" ) ) ; // actual result: *
---------- BEGIN SOURCE ----------
System.out.println( URLEncoder.encode( "~", "UTF-8" ) ) ;
System.out.println( URLEncoder.encode( "*", "UTF-8" ) ) ;
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Use a different URL encoder.
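Where switching encoders is not practical, one possible alternative is to post-process the output of URLEncoder. The sketch below is an assumption about how that could be done, not an official fix; the class name Rfc3986Workaround is hypothetical. It relies on the fact that URLEncoder percent-encodes literal "%" and "+" in its input, so the three substrings it rewrites can only have come from "~", "*", and space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical wrapper, not a JDK class: adjusts URLEncoder output
// toward the RFC 3986 unreserved set.
final class Rfc3986Workaround {
    static String encode(String input) {
        try {
            return URLEncoder.encode(input, "UTF-8")
                .replace("%7E", "~")    // "~" is unreserved, so undo its encoding
                .replace("*", "%2A")    // "*" is a sub-delim, so encode it
                .replace("+", "%20");   // optional: form-style "+" back to %20
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("~"));   // ~
        System.out.println(encode("*"));   // %2A
    }
}
```

The last replacement goes beyond the two characters this report is about; drop it if the "+" form encoding of spaces is wanted.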
FREQUENCY : always