JDK-8319228

Improve robustness of String constructors with mutable array inputs


Details

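    • Type: CSR
    • Resolution: Approved
    • Priority: P3
    • Fix Version: 22
    • Component: core-libs
    • Labels: None
    • Compatibility Kind: behavioral
    • Compatibility Risk: minimal
    • Compatibility Risk Description: The compatibility risk is minimal; no behavior is specified for modifying the arguments.
    • Interface Kind: Java API
    • Scope: SE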

    Description

      Summary

      Warn against modifying array and CharSequence arguments passed to String constructors and to methods of StringBuilder and Appendable; the results are unspecified if an argument is modified while the constructor or method is executing.

      Problem

      The documentation for constructing strings from arrays and for appending CharSequences does not warn against modifying the arrays or CharSequences during the calls. The affected java.lang classes are String, StringBuilder, and Appendable.

      Solution

      Add a warning to the constructors of String and to the methods of StringBuilder and Appendable stating that modifying an array or CharSequence argument during the call produces unspecified results.
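
      One way a caller might stay within the documented behavior is to copy a shared array before handing it to the constructor. The sketch below is illustrative only: the class and the helper name stableString are hypothetical and not part of the JDK. If another thread writes to the shared array while it is being cloned, the copy may still hold a mix of old and new values, but the String constructor itself only ever reads an array that nothing else modifies.

```java
import java.util.Arrays;

final class SnapshotBeforeString {
    // Hypothetical helper (not a JDK API): copy the shared array first, so the
    // array that the String constructor reads cannot change during construction.
    static String stableString(char[] shared) {
        char[] snapshot = shared.clone();   // private copy, unreachable by other threads
        return new String(snapshot);        // constructor sees a stable array
    }

    public static void main(String[] args) {
        char[] buf = {'j', 'a', 'v', 'a'};
        String s = stableString(buf);
        Arrays.fill(buf, 'x');              // later modification does not affect s
        System.out.println(s);              // prints "java"
    }
}
```

      The extra copy costs one allocation; whether that is worthwhile depends on whether the array is actually reachable from other threads.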

      Specification

      java.lang.String:

       /**
        * Allocates a new {@code String} so that it represents the sequence of
        * characters currently contained in the character array argument. The
        * contents of the character array are copied; subsequent modification of
        * the character array does not affect the newly created string.
        *
      + * <p> The contents of the string are unspecified if the character array
      + * is modified during string construction.
      + *
        * @param  value
        *         The initial value of the string
        */
       public String(char[] value) {...}
      
       /**
        * Allocates a new {@code String} that contains characters from a subarray
        * of the character array argument. The {@code offset} argument is the
        * index of the first character of the subarray and the {@code count}
        * argument specifies the length of the subarray. The contents of the
        * subarray are copied; subsequent modification of the character array does
        * not affect the newly created string.
        *
      + * <p> The contents of the string are unspecified if the character array
      + * is modified during string construction.
      + *
        * @param  value
        *         Array that is the source of characters
        *
        * @param  offset
        *         The initial offset
        *
        * @param  count
        *         The length
        *
        * @throws  IndexOutOfBoundsException
        *          If {@code offset} is negative, {@code count} is negative, or
        *          {@code offset} is greater than {@code value.length - count}
        */
       public String(char[] value, int offset, int count) {...}
      
       /**
        * Allocates a new {@code String} that contains characters from a subarray
        * of the <a href="Character.html#unicode">Unicode code point</a> array
        * argument.  The {@code offset} argument is the index of the first code
        * point of the subarray and the {@code count} argument specifies the
        * length of the subarray.  The contents of the subarray are converted to
        * {@code char}s; subsequent modification of the {@code int} array does not
        * affect the newly created string.
        *
      + * <p> The contents of the string are unspecified if the codepoints array
      + * is modified during string construction.
      + *
        * @param  codePoints
        *         Array that is the source of Unicode code points
        *
        * @param  offset
        *         The initial offset
        *
        * @param  count
        *         The length
        *
        * @throws  IllegalArgumentException
        *          If any invalid Unicode code point is found in {@code
        *          codePoints}
        *
        * @throws  IndexOutOfBoundsException
        *          If {@code offset} is negative, {@code count} is negative, or
        *          {@code offset} is greater than {@code codePoints.length - count}
        *
        * @since  1.5
        */
       public String(int[] codePoints, int offset, int count) {...} 
      
       /**
        * Allocates a new {@code String} constructed from a subarray of an array
        * of 8-bit integer values.
        *
        * <p> The {@code offset} argument is the index of the first byte of the
        * subarray, and the {@code count} argument specifies the length of the
        * subarray.
        *
        * <p> Each {@code byte} in the subarray is converted to a {@code char} as
        * specified in the {@link #String(byte[],int) String(byte[],int)} constructor.
        *
      + * <p> The contents of the string are unspecified if the byte array
      + * is modified during string construction.
      + *
        * @deprecated This method does not properly convert bytes into characters.
        * As of JDK&nbsp;1.1, the preferred way to do this is via the
        * {@code String} constructors that take a {@link Charset}, charset name,
        * or that use the {@link Charset#defaultCharset() default charset}.
        *
        * @param  ascii
        *         The bytes to be converted to characters
        *
        * @param  hibyte
        *         The top 8 bits of each 16-bit Unicode code unit
        *
        * @param  offset
        *         The initial offset
        * @param  count
        *         The length
        *
        * @throws  IndexOutOfBoundsException
        *          If {@code offset} is negative, {@code count} is negative, or
        *          {@code offset} is greater than {@code ascii.length - count}
        *
        * @see  #String(byte[], int)
        * @see  #String(byte[], int, int, java.lang.String)
        * @see  #String(byte[], int, int, java.nio.charset.Charset)
        * @see  #String(byte[], int, int)
        * @see  #String(byte[], java.lang.String)
        * @see  #String(byte[], java.nio.charset.Charset)
        * @see  #String(byte[])
        */
       @Deprecated(since="1.1")
       public String(byte[] ascii, int hibyte, int offset, int count) {...}

       /**
        * Allocates a new {@code String} containing characters constructed from
        * an array of 8-bit integer values. Each character c in the
        * resulting string is constructed from the corresponding component
        * b in the byte array such that:
        *
        * ...
        *
      + * <p> The contents of the string are unspecified if the byte array
      + * is modified during string construction.
      + *
        * @deprecated This method does not properly convert bytes into
        * characters. As of JDK&nbsp;1.1, the preferred way to do this is via the
        * {@code String} constructors that take a {@link Charset}, charset name,
        * or that use the {@link Charset#defaultCharset() default charset}.
        *
        * @param  ascii
        *         The bytes to be converted to characters
        *
        * @param  hibyte
        *         The top 8 bits of each 16-bit Unicode code unit
        *
        * @see  #String(byte[], int, int, java.lang.String)
        * @see  #String(byte[], int, int, java.nio.charset.Charset)
        * @see  #String(byte[], int, int)
        * @see  #String(byte[], java.lang.String)
        * @see  #String(byte[], java.nio.charset.Charset)
        * @see  #String(byte[])
        */
       @Deprecated(since="1.1")
       public String(byte[] ascii, int hibyte) {...}

       /**
        * Constructs a new {@code String} by decoding the specified subarray of
        * bytes using the specified charset. The length of the new {@code String}
        * is a function of the charset, and hence may not be equal to the length
        * of the subarray.
        *
        * <p> The behavior of this constructor when the given bytes are not valid
        * in the given charset is unspecified. The {@link
        * java.nio.charset.CharsetDecoder} class should be used when more control
        * over the decoding process is required.
        *
      + * <p> The contents of the string are unspecified if the byte array
      + * is modified during string construction.
      + *
        * @param  bytes
        *         The bytes to be decoded into characters
        *
        * @param  offset
        *         The index of the first byte to decode
        *
        * @param  length
        *         The number of bytes to decode
        *
        * @param  charsetName
        *         The name of a supported {@linkplain java.nio.charset.Charset
        *         charset}
        *
        * @throws  UnsupportedEncodingException
        *          If the named charset is not supported
        *
        * @throws  IndexOutOfBoundsException
        *          If {@code offset} is negative, {@code length} is negative, or
        *          {@code offset} is greater than {@code bytes.length - length}
        *
        * @since  1.1
        */
       public String(byte[] bytes, int offset, int length, String charsetName)
               throws UnsupportedEncodingException {...}

       /**
        * Constructs a new {@code String} by decoding the specified subarray of
        * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
        * The length of the new {@code String} is a function of the charset, and
        * hence may not be equal to the length of the subarray.
        *
        * <p> This method always replaces malformed-input and unmappable-character
        * sequences with this charset's default replacement string. The {@link
        * java.nio.charset.CharsetDecoder} class should be used when more control
        * over the decoding process is required.
        *
      + * <p> The contents of the string are unspecified if the byte array
      + * is modified during string construction.
      + *
        * @param  bytes
        *         The bytes to be decoded into characters
        *
        * @param  offset
        *         The index of the first byte to decode
        *
        * @param  length
        *         The number of bytes to decode
        *
        * @param  charset
        *         The {@linkplain java.nio.charset.Charset charset} to be used to
        *         decode the {@code bytes}
        *
        * @throws  IndexOutOfBoundsException
        *          If {@code offset} is negative, {@code length} is negative, or
        *          {@code offset} is greater than {@code bytes.length - length}
        *
        * @since  1.6
        */
       public String(byte[] bytes, int offset, int length, Charset charset) {...}

       /**
        * Constructs a new {@code String} by decoding the specified array of bytes
        * using the specified {@linkplain java.nio.charset.Charset charset}. The
        * length of the new {@code String} is a function of the charset, and hence
        * may not be equal to the length of the byte array.
        *
        * <p> The behavior of this constructor when the given bytes are not valid
        * in the given charset is unspecified. The {@link
        * java.nio.charset.CharsetDecoder} class should be used when more control
        * over the decoding process is required.
        *
      + * <p> The contents of the string are unspecified if the byte array
      + * is modified during string construction.
      + *
        * @param  bytes
        *         The bytes to be decoded into characters
        *
        * @param  charsetName
        *         The name of a supported {@linkplain java.nio.charset.Charset
        *         charset}
        *
        * @throws  UnsupportedEncodingException
        *          If the named charset is not supported
        *
        * @since  1.1
        */
       public String(byte[] bytes, String charsetName)
               throws UnsupportedEncodingException {...}

       /**
        * Constructs a new {@code String} by decoding the specified array of
        * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
        * The length of the new {@code String} is a function of the charset, and
        * hence may not be equal to the length of the byte array.
        *
        * <p> This method always replaces malformed-input and unmappable-character
        * sequences with this charset's default replacement string. The {@link
        * java.nio.charset.CharsetDecoder} class should be used when more control
        * over the decoding process is required.
        *
      + * <p> The contents of the string are unspecified if the byte array
      + * is modified during string construction.
      + *
        * @param  bytes
        *         The bytes to be decoded into characters
        *
        * @param  charset
        *         The {@linkplain java.nio.charset.Charset charset} to be used to
        *         decode the {@code bytes}
        *
        * @since  1.6
        */
       public String(byte[] bytes, Charset charset) {...}

       /**
        * Constructs a new {@code String} by decoding the specified subarray of
        * bytes using the {@link Charset#defaultCharset() default charset}.
        * The length of the new {@code String} is a function of the charset,
        * and hence may not be equal to the length of the subarray.
        *
        * <p> The behavior of this constructor when the given bytes are not valid
        * in the default charset is unspecified. The {@link
        * java.nio.charset.CharsetDecoder} class should be used when more control
        * over the decoding process is required.
        *
      + * <p> The contents of the string are unspecified if the byte array
      + * is modified during string construction.
      + *
        * @param  bytes
        *         The bytes to be decoded into characters
        *
        * @param  offset
        *         The index of the first byte to decode
        *
        * @param  length
        *         The number of bytes to decode
        *
        * @throws  IndexOutOfBoundsException
        *          If {@code offset} is negative, {@code length} is negative, or
        *          {@code offset} is greater than {@code bytes.length - length}
        *
        * @since  1.1
        */
       public String(byte[] bytes, int offset, int length) {...}

       /**
        * Constructs a new {@code String} by decoding the specified array of bytes
        * using the {@link Charset#defaultCharset() default charset}. The length
        * of the new {@code String} is a function of the charset, and hence may not
        * be equal to the length of the byte array.
        *
        * <p> The behavior of this constructor when the given bytes are not valid
        * in the default charset is unspecified. The {@link
        * java.nio.charset.CharsetDecoder} class should be used when more control
        * over the decoding process is required.
        *
      + * <p> The contents of the string are unspecified if the byte array
      + * is modified during string construction.
      + *
        * @param  bytes
        *         The bytes to be decoded into characters
        *
        * @since  1.1
        */
       public String(byte[] bytes) {...}

       /**
        * Allocates a new string that contains the sequence of characters
        * currently contained in the string builder argument. The contents of the
        * string builder are copied; subsequent modification of the string builder
        * does not affect the newly created string.
        *
      + * <p> The contents of the string are unspecified if the {@code StringBuilder}
      + * is modified during string construction.
      + *
        * <p> This constructor is provided to ease migration to {@code StringBuilder}.
        * Obtaining a string from a string builder via the {@code toString}
        * method is likely to run faster and is generally preferred.
        *
        * @param  builder
        *         A {@code StringBuilder}
        *
        * @since  1.5
        */
       public String(StringBuilder builder) {...}

       /**
        * Returns the string representation of the {@code char} array
        * argument. The contents of the character array are copied; subsequent
        * modification of the character array does not affect the returned
        * string.
        *
      + * <p> The contents of the string are unspecified if the character array
      + * is modified during string construction.
      + *
        * @param   data     the character array.
        * @return  a {@code String} that contains the characters of the
        *          character array.
        */
       public static String valueOf(char[] data) {...}

       /**
        * Returns the string representation of a specific subarray of the
        * {@code char} array argument.
        * <p>
        * The {@code offset} argument is the index of the first
        * character of the subarray. The {@code count} argument
        * specifies the length of the subarray. The contents of the subarray
        * are copied; subsequent modification of the character array does not
        * affect the returned string.
        *
      + * <p> The contents of the string are unspecified if the character array
      + * is modified during string construction.
      + *
        * @param   data     the character array.
        * @param   offset   initial offset of the subarray.
        * @param   count    length of the subarray.
        * @return  a {@code String} that contains the characters of the
        *          specified subarray of the character array.
        * @throws    IndexOutOfBoundsException if {@code offset} is
        *            negative, or {@code count} is negative, or
        *            {@code offset+count} is larger than
        *            {@code data.length}.
        */
       public static String valueOf(char[] data, int offset, int count) {...}
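
      The byte-array constructors above can be used in the same defensive way. The following sketch assumes a hypothetical helper, decodeUtf8, that copies the requested subarray before decoding; the explicit bounds check is there because Arrays.copyOfRange pads with zeros rather than throwing when the end index runs past the array. It also illustrates the documented point that the string length is a function of the charset and may differ from the byte count.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Objects;

final class DecodeFromSnapshot {
    // Hypothetical helper (not a JDK API): decode a private copy of the subarray.
    static String decodeUtf8(byte[] shared, int offset, int length) {
        // Same bounds rule as the String constructor: throws IndexOutOfBoundsException
        // if offset or length is negative, or offset > shared.length - length.
        Objects.checkFromIndexSize(offset, length, shared.length);
        byte[] snapshot = Arrays.copyOfRange(shared, offset, offset + length);
        return new String(snapshot, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "h", e-acute, "llo": the e-acute takes two bytes in UTF-8.
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        String s = decodeUtf8(data, 0, data.length);
        System.out.println(data.length + " bytes, " + s.length() + " chars");
    }
}
```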

      java.lang.StringBuilder:

       /**
        * Appends a subsequence of the specified {@code CharSequence} to this
        * sequence.
        * <p>
        * Characters of the argument {@code s}, starting at
        * index {@code start}, are appended, in order, to the contents of
        * this sequence up to the (exclusive) index {@code end}. The length
        * of this sequence is increased by the value of {@code end - start}.
        * <p>
        * Let <i>n</i> be the length of this character sequence just prior to
        * execution of the {@code append} method. Then the character at
        * index <i>k</i> in this character sequence becomes equal to the
        * character at index <i>k</i> in this sequence, if <i>k</i> is less than
        * <i>n</i>; otherwise, it is equal to the character at index
        * <i>k+start-n</i> in the argument {@code s}.
        * <p>
        * If {@code s} is {@code null}, then this method appends
        * characters as if the s parameter was a sequence containing the four
        * characters {@code "null"}.
      + * <p>
      + * The contents are unspecified if the {@code CharSequence}
      + * is modified during the method call or an exception is thrown
      + * when accessing the {@code CharSequence}.
        *
        * @param   s the sequence to append.
        * @param   start   the starting index of the subsequence to be appended.
        * @param   end     the end index of the subsequence to be appended.
        * @return  a reference to this object.
        * @throws     IndexOutOfBoundsException if
        *             {@code start} is negative, or
        *             {@code start} is greater than {@code end} or
        *             {@code end} is greater than {@code s.length()}
        */
       @Override
       public StringBuilder append(CharSequence s, int start, int end) {...}
      
       /**
        * Inserts the specified {@code CharSequence} into this sequence.
        * <p>
        * The characters of the {@code CharSequence} argument are inserted,
        * in order, into this sequence at the indicated offset, moving up
        * any characters originally above that position and increasing the length
        * of this sequence by the length of the argument s.
        * <p>
        * The result of this method is exactly the same as if it were an
        * invocation of this object's
        * {@link #insert(int,CharSequence,int,int) insert}(dstOffset, s, 0, s.length())
        * method.
      + * <p>
      + * The contents are unspecified if the {@code CharSequence}
      + * is modified during the method call or an exception is thrown
      + * when accessing the {@code CharSequence}.
        *
        * <p>If {@code s} is {@code null}, then the four characters
        * {@code "null"} are inserted into this sequence.
        *
        * @param      dstOffset   the offset.
        * @param      s the sequence to be inserted
        * @return     a reference to this object.
        * @throws     IndexOutOfBoundsException  if the offset is invalid.
        */
       public StringBuilder insert(int dstOffset, CharSequence s) {...}
      
       /**
        * Inserts a subsequence of the specified {@code CharSequence} into
        * this sequence.
        * <p>
        * The subsequence of the argument {@code s} specified by
        * {@code start} and {@code end} are inserted,
        * in order, into this sequence at the specified destination offset, moving
        * up any characters originally above that position. The length of this
        * sequence is increased by {@code end - start}.
        * <p>
        * The character at index <i>k</i> in this sequence becomes equal to:
        * <ul>
        * <li>the character at index <i>k</i> in this sequence, if
        * <i>k</i> is less than {@code dstOffset}
        * <li>the character at index <i>k</i>{@code +start-dstOffset} in
        * the argument {@code s}, if <i>k</i> is greater than or equal to
        * {@code dstOffset} but is less than {@code dstOffset+end-start}
        * <li>the character at index <i>k</i>{@code -(end-start)} in this
        * sequence, if <i>k</i> is greater than or equal to
        * {@code dstOffset+end-start}
        * </ul><p>
        * The {@code dstOffset} argument must be greater than or equal to
        * {@code 0}, and less than or equal to the {@linkplain #length() length}
        * of this sequence.
        * <p>The start argument must be nonnegative, and not greater than
        * {@code end}.
        * <p>The end argument must be greater than or equal to
        * {@code start}, and less than or equal to the length of s.
        *
        * <p>If {@code s} is {@code null}, then this method inserts
        * characters as if the s parameter was a sequence containing the four
        * characters {@code "null"}.
      + * <p>
      + * The contents are unspecified if the {@code CharSequence}
      + * is modified during the method call or an exception is thrown
      + * when accessing the {@code CharSequence}.
        *
        * @param      dstOffset   the offset in this sequence.
        * @param      s       the sequence to be inserted.
        * @param      start   the starting index of the subsequence to be inserted.
        * @param      end     the end index of the subsequence to be inserted.
        * @return     a reference to this object.
        * @throws     IndexOutOfBoundsException  if {@code dstOffset}
        *             is negative or greater than {@code this.length()}, or
        *              {@code start} or {@code end} are negative, or
        *              {@code start} is greater than {@code end} or
        *              {@code end} is greater than {@code s.length()}
        */
       public StringBuilder insert(int dstOffset, CharSequence s,
                                           int start, int end) {...} 
      
       /**
        * Appends {@code count} copies of the specified {@code CharSequence} {@code cs}
        * to this sequence.
        * <p>
        * The length of this sequence increases by {@code count} times the
        * {@code CharSequence} length.
        * <p>
        * If {@code cs} is {@code null}, then the four characters
        * {@code "null"} are repeated into this sequence.
      + * <p>
      + * The contents are unspecified if the {@code CharSequence}
      + * is modified during the method call or an exception is thrown
      + * when accessing the {@code CharSequence}.
        *
        * @param cs     a {@code CharSequence}
        * @param count  number of times to copy
        *
        * @return  a reference to this object.
        *
        * @throws IllegalArgumentException  if {@code count} is negative
        *
        * @since 21
        */
       public StringBuilder repeat(CharSequence cs, int count) {...}
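
      For the StringBuilder methods, a caller that cannot rule out concurrent modification of a mutable CharSequence (for example, another StringBuilder) might first convert it to an immutable String. This is only a sketch: appendStable is a hypothetical helper, and it assumes the source's toString() returns the characters of the sequence, as the CharSequence contract requires. A race during the conversion can still yield a mixed snapshot, but the append itself then reads an immutable value.

```java
final class AppendSnapshot {
    // Hypothetical helper (not a JDK API): append an immutable snapshot of source.
    static StringBuilder appendStable(StringBuilder target, CharSequence source) {
        // String.valueOf(Object) yields "null" for a null argument, which matches
        // the documented behavior of append for a null CharSequence.
        String snapshot = String.valueOf(source);
        return target.append(snapshot);
    }

    public static void main(String[] args) {
        StringBuilder shared = new StringBuilder("abc");
        StringBuilder out = new StringBuilder("x-");
        appendStable(out, shared);
        shared.setLength(0);          // later changes to shared do not affect out
        System.out.println(out);      // prints "x-abc"
    }
}
```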

      java.lang.Appendable:

       /**
        * Appends the specified character sequence to this {@code Appendable}.
        *
        * <p> Depending on which class implements the character sequence
        * {@code csq}, the entire sequence may not be appended.  For
        * instance, if {@code csq} is a {@link java.nio.CharBuffer} then
        * the subsequence to append is defined by the buffer's position and limit.
      + * <p>
      + * The contents are unspecified if the {@code CharSequence}
      + * is modified during the method call or an exception is thrown
      + * when accessing the {@code CharSequence}.
        *
        * @param  csq
        *         The character sequence to append.  If {@code csq} is
        *         {@code null}, then the four characters {@code "null"} are
        *         appended to this Appendable.
        *
        * @return  A reference to this {@code Appendable}
        *
        * @throws  IOException
        *          If an I/O error occurs
        */
       Appendable append(CharSequence csq) throws IOException;
      
       /**
        * Appends a subsequence of the specified character sequence to this
        * {@code Appendable}.
        *
        * <p> An invocation of this method of the form {@code out.append(csq, start, end)}
        * when {@code csq} is not {@code null}, behaves in
        * exactly the same way as the invocation
        *
        * <pre>
        *     out.append(csq.subSequence(start, end)) </pre>
        *
      + * <p>
      + * The contents are unspecified if the {@code CharSequence}
      + * is modified during the method call or an exception is thrown
      + * when accessing the {@code CharSequence}.
        * @param  csq
        *         The character sequence from which a subsequence will be
        *         appended.  If {@code csq} is {@code null}, then characters
        *         will be appended as if {@code csq} contained the four
        *         characters {@code "null"}.
        *
        * @param  start
        *         The index of the first character in the subsequence
        *
        * @param  end
        *         The index of the character following the last character in the
        *         subsequence
        *
        * @return  A reference to this {@code Appendable}
        *
        * @throws  IndexOutOfBoundsException
        *          If {@code start} or {@code end} are negative, {@code start}
        *          is greater than {@code end}, or {@code end} is greater than
        *          {@code csq.length()}
        *
        * @throws  IOException
        *          If an I/O error occurs
        */
       Appendable append(CharSequence csq, int start, int end) throws IOException;
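
      The two Appendable behaviors described above can be illustrated with a StringBuilder used as the Appendable. The sketch below uses hypothetical helper names; it shows that appending a CharBuffer appends only the characters between the buffer's position and limit, and that the three-argument form appends the subsequence from start (inclusive) to end (exclusive). The checked IOException is wrapped only to keep the helpers easy to call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.CharBuffer;

final class AppendableCharBufferDemo {
    // Hypothetical helper: append whatever the CharSequence currently exposes.
    static String appendAll(CharSequence csq) {
        try {
            Appendable out = new StringBuilder();
            out.append(csq);
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);   // StringBuilder does not throw this in practice
        }
    }

    // Hypothetical helper: append the subsequence [start, end).
    static String appendRange(CharSequence csq, int start, int end) {
        try {
            Appendable out = new StringBuilder();
            out.append(csq, start, end);
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CharBuffer buf = CharBuffer.wrap("hello world");
        buf.position(6);                                            // skip "hello "
        System.out.println(appendAll(buf));                         // prints "world"
        System.out.println(appendRange("hello world", 0, 5));       // prints "hello"
    }
}
```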

            People

              rriggs Roger Riggs
              webbuggrp Webbug Group
              Brian Burkhalter, Lance Andersen, Raffaello Giulietti