Type: CSR
Resolution: Approved
Priority: P4
Fix Version/s: 22
Component/s: core-libs
Labels: None
Subcomponent: java.net
Compatibility Kind: behavioral
Compatibility Risk: minimal
Compatibility Risk Description: This is just documenting longstanding behaviour. There is no change in the implementation.
Interface Kind: Java API
Scope: SE

Summary

Updated the API documentation of URLEncoder.encode and URLDecoder.decode to reflect pre-existing behavior.

Problem

Currently, the descriptions of URLEncoder.encode and URLDecoder.decode do not specify that the charset's replacement bytes (when encoding) or replacement character (when decoding) are used when a character or byte sequence cannot be handled. This is longstanding behavior, but it needs to be documented.

Solution

Added a new line to URLEncoder.encode API documentation to document that the charset's replacement bytes are used.
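
The encoder behavior being documented can be illustrated with a small sketch. The class and method names here (`EncoderReplacementDemo`, `encodeAscii`) are hypothetical and not part of the CSR; the example assumes a JDK that provides `URLEncoder.encode(String, Charset)` (Java 10 or later). US-ASCII is chosen only because its replacement byte is `?` (0x3F), which makes the substitution easy to see.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncoderReplacementDemo {

    // Hypothetical helper: encodes with US-ASCII, which cannot map
    // most non-ASCII characters.
    static String encodeAscii(String s) {
        return URLEncoder.encode(s, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // U+20AC (euro sign) is unmappable in US-ASCII, so the charset's
        // replacement byte '?' (0x3F) is percent-encoded in its place.
        System.out.println(encodeAscii("a\u20ACb")); // prints "a%3Fb"

        // A lone high surrogate is malformed input; it is likewise
        // replaced rather than rejected with an exception.
        System.out.println(URLEncoder.encode("\uD800", StandardCharsets.UTF_8));
    }
}
```

Note that the result is indistinguishable from encoding a literal `?`, which is why the substitution is worth calling out in the specification.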

Also changed the URLDecoder.decode API documentation to document its use of the charset's replacement character, adjusted some wording, and used apiNote.
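
The decoder side can be sketched the same way. The names `DecoderReplacementDemo` and `decodeUtf8` are hypothetical; the example assumes `URLDecoder.decode(String, Charset)` (Java 10 or later) and uses UTF-8, whose replacement character is U+FFFD.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class DecoderReplacementDemo {

    // Hypothetical helper: decodes percent-escapes as UTF-8 bytes.
    static String decodeUtf8(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "%FF" is a well-formed escape sequence, but the byte 0xFF can
        // never occur in valid UTF-8. No exception is thrown; the byte is
        // replaced with the replacement character U+FFFD.
        String decoded = decodeUtf8("a%FFb");
        System.out.println(decoded.equals("a\uFFFDb")); // prints "true"
    }
}
```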

Updated the other decode methods in URLDecoder to reflect that they can throw IllegalArgumentException.
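
The distinction between erroneous bytes (replaced) and malformed escape sequences (rejected) can be sketched as follows. The class and method names (`MalformedEscapeDemo`, `throwsIae`) are hypothetical, and the sample inputs are illustrative rather than taken from the CSR.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class MalformedEscapeDemo {

    // Hypothetical helper: reports whether decoding the input throws
    // IllegalArgumentException.
    static boolean throwsIae(String s) {
        try {
            URLDecoder.decode(s, StandardCharsets.UTF_8);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsIae("%zz")); // non-hex digits after '%': true
        System.out.println(throwsIae("%4"));  // incomplete trailing escape: true
        System.out.println(throwsIae("%FF")); // well-formed escape, bad byte: false
    }
}
```

In other words, the exception concerns the syntax of the `%xy` escapes, while the replacement character concerns the bytes those escapes produce.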

Specification

java.net.URLEncoder.encode

     /**
      * Translates a string into {@code application/x-www-form-urlencoded}
      * format using a specific {@linkplain Charset Charset}.
      * This method uses the supplied charset to obtain the bytes for unsafe
      * characters.
      * <p>
  -   * <em><strong>Note:</strong> The <a href=
  +   * If the input string is malformed, or if the input cannot be mapped
  +   * to a valid byte sequence in the given {@code Charset}, then the
  +   * erroneous input will be replaced with the {@code Charset}'s
  +   * {@linkplain CharsetEncoder##cae replacement values}.
  +   *
  +   * @apiNote The <a href=
      * "http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars">
      * World Wide Web Consortium Recommendation</a> states that
  -   * UTF-8 should be used. Not doing so may introduce incompatibilities.</em>
  -   *
  +   * UTF-8 should be used. Not doing so may introduce incompatibilities.
      * @param   s   {@code String} to be translated.
      * @param charset the given charset
      * @return  the translated {@code String}.

java.net.URLDecoder.decode

@@ -98,6 +98,8 @@ private URLDecoder() {}
     *          default charset. Instead, use the decode(String,String) method
     *          to specify the encoding.
     * @return the newly decoded {@code String}
 +   * @throws IllegalArgumentException if the implementation encounters malformed
 +   * escape sequences


@@ -113,9 +115,6 @@ public static String decode(String s) {
     * except that it will {@linkplain Charset#forName look up the charset}
     * using the given encoding name.
     *
 -   * @implNote This implementation will throw an {@link java.lang.IllegalArgumentException}
 -   * when illegal strings are encountered.
 -   *


@@ -124,6 +123,8 @@ public static String decode(String s) {
     * @throws UnsupportedEncodingException
     *             If character encoding needs to be consulted, but
     *             named character encoding is not supported
 +   * @throws IllegalArgumentException if the implementation encounters malformed
 +   * escape sequences


@@ -144,24 +145,23 @@ public static String decode(String s, String enc) throws UnsupportedEncodingExce
     * Decodes an {@code application/x-www-form-urlencoded} string using
     * a specific {@linkplain Charset Charset}.
     * The supplied charset is used to determine
 -   * what characters are represented by any consecutive sequences of the
 -   * form "<i>{@code %xy}</i>".
 +   * what characters are represented by any consecutive escape sequences of
 +   * the form "<i>{@code %xy}</i>". Erroneous bytes are replaced with the
 +   * supplied {@code Charset}'s {@linkplain java.nio.charset.CharsetDecoder##cae
 +   * replacement value}.
     * <p>
     * <em><strong>Note:</strong> The <a href=
     * "http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars">
     * World Wide Web Consortium Recommendation</a> states that
     * UTF-8 should be used. Not doing so may introduce
     * incompatibilities.</em>
     *
 -   * @implNote This implementation will throw an {@link java.lang.IllegalArgumentException}
 -   * when illegal strings are encountered.
 -   *
     * @param s the {@code String} to decode
     * @param charset the given charset
     * @return the newly decoded {@code String}
     * @throws NullPointerException if {@code s} or {@code charset} is {@code null}
 -   * @throws IllegalArgumentException if the implementation encounters illegal
 -   * characters
 +   * @throws IllegalArgumentException if the implementation encounters malformed
 +   * escape sequences

csr of

JDK-8316734 URLEncoder should specify that replacement bytes will be used in case of coding error

Resolved

Assignee: Darragh Clarke

Reporter: Daniel Fuchs

Reviewed By: Alan Bateman, Daniel Fuchs

Votes: 0

Watchers: 2

Created: 2023-10-12 03:23

Updated: 2023-11-28 09:29

Resolved: 2023-11-28 09:29
