JDK-8236688

Clarify String::stripIndent javadoc when string ends with line terminator


    • Type: CSR
    • Resolution: Approved
    • Priority: P3
    • 15
    • core-libs
    • None
    • minimal
    • Clarification of existing javadoc. No change in behaviour.
    • Java API
    • SE

      Summary

      This change clarifies the method result when the original string ends with a line terminator.

      Problem

      Users are often surprised when the last line seemingly disappears.
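The effect can be reproduced with a short sketch. It assumes JDK 15 or later, and the class and helper names are hypothetical, chosen only for illustration. The three inputs differ only in what follows the last line terminator:

```java
// Illustrative sketch; the class and method names are hypothetical.
class TrailingTerminatorDemo {
    // Thin wrapper so the three cases below read uniformly.
    static String strip(String s) {
        return s.stripIndent();
    }

    // Makes line terminators visible when printing.
    static String show(String s) {
        return s.replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // No trailing terminator: the last line is "    b", so min is 4.
        System.out.println(show(strip("    a\n    b")));
        // Trailing terminator: the last line is the empty string after the
        // final "\n". Its indentation is 0, so min is 0 and nothing is removed.
        System.out.println(show(strip("    a\n    b\n")));
        // Trailing terminator followed by four spaces: the blank last line
        // contributes 4, so the block shifts left and the last line becomes empty.
        System.out.println(show(strip("    a\n    b\n    ")));
    }
}
```

In the second case the indentation survives because the empty last line counts toward the minimum, which is the behaviour the updated javadoc spells out.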

      Solution

      Update the javadoc to clarify what is actually happening.

      Specification

      diff -r 968b57610c0f src/java.base/share/classes/java/lang/String.java
      --- a/src/java.base/share/classes/java/lang/String.java Sat May 30 10:33:28 2020 +0530
      +++ b/src/java.base/share/classes/java/lang/String.java Mon Jun 01 08:06:02 2020 -0300
      @@ -2917,22 +2917,34 @@
            * |    </body>
            * |</html>
            * </pre></blockquote>
      -     * First, the individual lines of this string are extracted as if by using
      -     * {@link String#lines()}.
      +     * First, the individual lines of this string are extracted. A <i>line</i>
      +     * is a sequence of zero or more characters followed by either a line
      +     * terminator or the end of the string.
      +     * If the string has at least one line terminator, the last line consists
      +     * of the characters between the last terminator and the end of the string.
      +     * Otherwise, if the string has no terminators, the last line is the start
      +     * of the string to the end of the string, in other words, the entire
      +     * string.
      +     * A line does not include the line terminator.
            * <p>
      -     * Then, the <i>minimum indentation</i> (min) is determined as follows.
      -     * For each non-blank line (as defined by {@link String#isBlank()}), the
      -     * leading {@linkplain Character#isWhitespace(int) white space} characters are
      -     * counted. The leading {@linkplain Character#isWhitespace(int) white space}
      -     * characters on the last line are also counted even if
      -     * {@linkplain String#isBlank() blank}. The <i>min</i> value is the smallest
      -     * of these counts.
      +     * Then, the <i>minimum indentation</i> (min) is determined as follows:
      +     * <ul>
      +     *   <li><p>For each non-blank line (as defined by {@link String#isBlank()}),
      +     *   the leading {@linkplain Character#isWhitespace(int) white space}
      +     *   characters are counted.</p>
      +     *   </li>
      +     *   <li><p>The leading {@linkplain Character#isWhitespace(int) white space}
      +     *   characters on the last line are also counted even if
      +     *   {@linkplain String#isBlank() blank}.</p>
      +     *   </li>
      +     * </ul>
      +     * <p>The <i>min</i> value is the smallest of these counts.
            * <p>
            * For each {@linkplain String#isBlank() non-blank} line, <i>min</i> leading
      -     * {@linkplain Character#isWhitespace(int) white space} characters are removed,
      -     * and any trailing {@linkplain Character#isWhitespace(int) white space}
      -     * characters are removed. {@linkplain String#isBlank() Blank} lines are
      -     * replaced with the empty string.
      +     * {@linkplain Character#isWhitespace(int) white space} characters are
      +     * removed, and any trailing {@linkplain Character#isWhitespace(int) white
      +     * space} characters are removed. {@linkplain String#isBlank() Blank} lines
      +     * are replaced with the empty string.
            *
            * <p>
            * Finally, the lines are joined into a new string, using the LF character
      @@ -2943,12 +2955,11 @@
            * possible to the left, while preserving relative indentation. Lines
            * that were indented the least will thus have no leading
            * {@linkplain Character#isWhitespace(int) white space}.
      -     * The line count of the result will be the same as line count of this
      -     * string.
      +     * The result will have the same number of line terminators as this string.
            * If this string ends with a line terminator then the result will end
            * with a line terminator.
            *
      -     * @implNote
      +     * @implSpec
            * This method treats all {@linkplain Character#isWhitespace(int) white space}
            * characters as having equal width. As long as the indentation on every
            * line is consistently composed of the same character sequences, then the

      stripIndent

      public String stripIndent()

      Returns a string whose value is this string, with incidental white space removed from the beginning and end of every line. Incidental white space is often present in a text block to align the content with the opening delimiter. For example, in the following code, dots represent incidental white space:

       String html = """
       ..............<html>
       ..............    <body>
       ..............        <p>Hello, world</p>
       ..............    </body>
       ..............</html>
       ..............""";

      This method treats the incidental white space as indentation to be stripped, producing a string that preserves the relative indentation of the content. Using | to visualize the start of each line of the string:

       |<html>
       |    <body>
       |        <p>Hello, world</p>
       |    </body>
       |</html>

      First, the individual lines of this string are extracted. A line is a sequence of zero or more characters followed by either a line terminator or the end of the string. If the string has at least one line terminator, the last line consists of the characters between the last terminator and the end of the string. Otherwise, if the string has no terminators, the last line is the start of the string to the end of the string, in other words, the entire string. A line does not include the line terminator.

      Then, the minimum indentation (min) is determined as follows:

        • For each non-blank line (as defined by isBlank()), the leading white space characters are counted.
        • The leading white space characters on the last line are also counted even if blank.

      The min value is the smallest of these counts.
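The rule for min can be sketched directly from the wording of the specification. This is not the JDK implementation: the class name, the helper names, and the choice to split on the terminators \n, \r, and \r\n are assumptions made for illustration.

```java
// A minimal sketch of the "minimum indentation" rule, not the JDK source.
// All names here are hypothetical.
class MinIndentSketch {
    // Counts leading characters for which Character.isWhitespace is true.
    static int leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    static int minIndentation(String s) {
        // Limit -1 keeps a trailing empty string, so a string that ends with
        // a terminator yields an empty last line, as the specification says.
        String[] lines = s.split("\r\n|\r|\n", -1);
        int min = Integer.MAX_VALUE;
        for (int i = 0; i < lines.length; i++) {
            boolean isLast = (i == lines.length - 1);
            // Non-blank lines always count; the last line counts even if blank.
            if (!lines[i].isBlank() || isLast) {
                min = Math.min(min, leadingWhitespace(lines[i]));
            }
        }
        return min;
    }

    public static void main(String[] args) {
        System.out.println(minIndentation("    a\n  b"));
        System.out.println(minIndentation("    a\n    b\n"));
        System.out.println(minIndentation("    a\n    b\n  "));
        System.out.println(minIndentation("    a\n\n    b"));
    }
}
```

A blank line in the middle of the string is ignored, while a blank last line is counted, so a string ending in a line terminator always has a minimum of 0.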

      For each non-blank line, min leading white space characters are removed, and any trailing white space characters are removed. Blank lines are replaced with the empty string.

      Finally, the lines are joined into a new string, using the LF character "\n" (U+000A) to separate lines.
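The removal and joining steps can be observed with two small inputs. The sketch assumes JDK 15 or later; the class and method names are hypothetical.

```java
// Illustrative sketch; the class and method names are hypothetical.
class StripAndJoinDemo {
    // Mixed terminators (\r\n and \r) in the input are normalized to \n.
    static String normalized() {
        return "  a\r\n  b\r  c".stripIndent();
    }

    // Trailing white space is removed from non-blank lines, and the
    // white-space-only middle line is replaced with the empty string.
    static String trimmed() {
        return "  a  \n   \n  b".stripIndent();
    }

    public static void main(String[] args) {
        // Escape terminators so the joined result is visible on one line.
        System.out.println(normalized().replace("\n", "\\n"));
        System.out.println(trimmed().replace("\n", "\\n"));
    }
}
```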

      API Note:
      This method's primary purpose is to shift a block of lines as far as possible to the left, while preserving relative indentation. Lines that were indented the least will thus have no leading white space. The result will have the same number of line terminators as this string. If this string ends with a line terminator then the result will end with a line terminator.
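As a rough illustration of this note, the sketch below shifts a small block to the left and compares the number of line terminators before and after. It assumes JDK 15 or later; the class, the constant, and the counting helper are hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative sketch; all names are hypothetical.
class LeftShiftDemo {
    static final String INPUT = "    if (x) {\r\n        y();\r\n    }";

    // Counts \r\n, \r, and \n, treating \r\n as a single terminator.
    static long countTerminators(String s) {
        return Pattern.compile("\r\n|\r|\n").matcher(s).results().count();
    }

    static String shifted() {
        return INPUT.stripIndent();
    }

    public static void main(String[] args) {
        String result = shifted();
        // The least-indented lines end up with no leading white space,
        // while the inner line keeps its relative indentation of four.
        System.out.println(result);
        System.out.println(countTerminators(INPUT) + " -> " + countTerminators(result));
    }
}
```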

      Implementation Requirements:
      This method treats all white space characters as having equal width. As long as the indentation on every line is consistently composed of the same character sequences, then the result will be as described above.
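The equal-width rule matters when tabs and spaces are mixed. In the sketch below (JDK 15 or later assumed, names hypothetical), a tab counts as one white space character regardless of how wide an editor renders it.

```java
// Illustrative sketch; the class and method names are hypothetical.
class WhitespaceWidthDemo {
    // One tab versus four spaces: the counts are 1 and 4, so min is 1 and
    // only one character is removed from each line.
    static String mixed() {
        return "\ta\n    b".stripIndent();
    }

    // Indentation built consistently from tabs: min is 1, and the relative
    // indentation of one tab is preserved on the first line.
    static String consistent() {
        return "\t\ta\n\tb".stripIndent();
    }

    public static void main(String[] args) {
        // Escape tabs and terminators so the results are visible.
        System.out.println(mixed().replace("\t", "\\t").replace("\n", "\\n"));
        System.out.println(consistent().replace("\t", "\\t").replace("\n", "\\n"));
    }
}
```

With mixed indentation the second line keeps three leading spaces, which may not match what the source looked like in an editor; with consistent indentation the result is as described above.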

      Returns:
      string with incidental indentation removed and line terminators normalized

      Since:
      15

      See Also:
      lines(), isBlank(), indent(int), Character.isWhitespace(int)

            jlaskey Jim Laskey
            Brent Christian, Roger Riggs