JDK-8296529

Document the default behavior of '$' in regular expressions correctly


    • Type: CSR
    • Resolution: Approved
    • Priority: P4
    • 20
    • core-libs
    • None
    • behavioral
    • minimal
    • This is only a spec change, not a code change.
    • Java API
    • SE

      Summary

      The current spec does not fully document the long-standing behavior of the boundary matcher $.

      Problem

      The spec for java.util.regex.Pattern does not mention that $, when not in MULTILINE mode, matches not only at the end of the input sequence but also just before a final line terminator, that is, a line terminator not followed by any other input character.
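
      The behavior described above can be observed with a small program. This is only an illustrative sketch: the class name DollarDefaultModeDemo and the helper found are made up for this example, and the pattern "a$" is an arbitrary choice. Only java.util.regex.Pattern and Matcher.find() come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only java.util.regex.Pattern comes from the JDK.
class DollarDefaultModeDemo {

    // True if "a$", compiled without MULTILINE, is found somewhere in the input.
    static boolean found(String input) {
        return Pattern.compile("a$").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("a"));      // 'a' sits at the end of input
        System.out.println(found("a\n"));    // final line terminator, nothing after it
        System.out.println(found("a\r\n"));  // \r\n is treated as a single line terminator
        System.out.println(found("a\nb"));   // line terminator followed by more input
    }
}
```

      The first three calls print true and the last prints false: $ accepts the position before a trailing line terminator, but not before one that is followed by further input.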

      Solution

      Adapt the spec to match established behavior.

      Specification

      --- a/src/java.base/share/classes/java/util/regex/Pattern.java
      +++ b/src/java.base/share/classes/java/util/regex/Pattern.java
      @@ -484,9 +484,15 @@ import jdk.internal.util.regex.Grapheme;
        * <p> The regular expression {@code .} matches any character except a line
        * terminator unless the {@link #DOTALL} flag is specified.
        *
      - * <p> By default, the regular expressions {@code ^} and {@code $} ignore
      - * line terminators and only match at the beginning and the end, respectively,
      - * of the entire input sequence. If {@link #MULTILINE} mode is activated then
      + * <p> If {@link #MULTILINE} mode is not activated, the regular expression
      + * {@code ^} ignores line terminators and only matches at the beginning of
      + * the entire input sequence. The regular expression {@code $} matches at the
      + * end of the entire input sequence, but also matches just before the last line
      + * terminator if this is not followed by any other input character. Other line
      + * terminators are ignored, including the last one if it is followed by other
      + * input characters.
      + *
      + * <p> If {@link #MULTILINE} mode is activated then
        * {@code ^} matches at the beginning of input and after any line terminator
        * except at the end of input. When in {@link #MULTILINE} mode {@code $}
        * matches just before a line terminator or the end of the input sequence.
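
      To compare the two modes described in the revised text, the following sketch lists the positions at which ^ and $ match in a sample input. The class name BoundaryPositionsDemo, the helper positions and the sample string "a\nb\n" are assumptions chosen for illustration; they are not part of the change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the helper below is not part of the JDK.
class BoundaryPositionsDemo {

    // Collects the start index of every (empty) match of regex in input.
    static List<Integer> positions(String regex, int flags, String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            result.add(m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "a\nb\n"; // indices: a=0, \n=1, b=2, \n=3, end of input=4

        // Without MULTILINE: ^ only at the beginning of input; $ before the
        // final line terminator and at the end of input.
        System.out.println(positions("^", 0, input));                 // [0]
        System.out.println(positions("$", 0, input));                 // [3, 4]

        // With MULTILINE: ^ also after a line terminator, except at the end
        // of input; $ before every line terminator and at the end of input.
        System.out.println(positions("^", Pattern.MULTILINE, input)); // [0, 2]
        System.out.println(positions("$", Pattern.MULTILINE, input)); // [1, 3, 4]
    }
}
```

      Note that without MULTILINE, $ reports two positions for this input, which is why a call such as "a\n".replaceAll("$", "X") inserts X twice.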

            rgiulietti Raffaello Giulietti
            webbuggrp Webbug Group
            Stuart Marks