JDK-8296292

Document the default behavior of '$' in regular expressions correctly

Details

    • b25
    • generic
    • generic

    Description

      A DESCRIPTION OF THE PROBLEM :
      The Javadoc for Pattern does not document that $, when not in MULTILINE mode, matches not only at the very end of the input sequence but also immediately before a final line terminator in the input sequence.

      Currently, the Javadoc says: "By default, the regular expressions ^ and $ ignore line terminators and only match at the beginning and the end, respectively, of the entire input sequence."

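      However, the following code produces two matches, a behavior apparently inherited from Perl:
      ---
      import java.util.regex.Pattern;

      // $ without MULTILINE, applied to an input that ends in a line terminator
      var m = Pattern.compile("$").matcher("\n\n");
      while (m.find()) {
          System.out.println(m.start() + ":" + m.end());  // prints 1:1, then 2:2
      }
      ---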

      Internally, this behavior is documented in the comment on the nested class "Pattern.Dollar":
      "When not in multiline mode, the $ can only match at the very end
      of the input, unless the input ends in a line terminator in which
      it matches right before the last line terminator."

      This should be reflected in the public Javadoc as well.
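
      For comparison, here is a minimal sketch that contrasts $ with the boundary matchers \z and \Z on an input ending in a line terminator. The class name DollarDemo and the helper matchPositions are hypothetical and exist only for this illustration; the only library calls used are Pattern.compile, Pattern.matcher, Matcher.find, Matcher.start and Matcher.end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarDemo {
    // Hypothetical helper: returns "start:end" for every match of regex in input,
    // compiled with default flags (in particular, without MULTILINE).
    static List<String> matchPositions(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> positions = new ArrayList<>();
        while (m.find()) {
            positions.add(m.start() + ":" + m.end());
        }
        return positions;
    }

    public static void main(String[] args) {
        String input = "a\n"; // ends in a single line terminator

        // $ matches before the final line terminator and at the very end.
        System.out.println(matchPositions("$", input));   // [1:1, 2:2]

        // \z matches only at the very end of the input.
        System.out.println(matchPositions("\\z", input)); // [2:2]

        // \Z matches at the end of the input but for the final terminator, if any.
        System.out.println(matchPositions("\\Z", input)); // [1:1, 2:2]
    }
}
```

      In this sketch, $ in default mode yields the same positions as \Z rather than \z, which is the behavior the public Javadoc sentence quoted above does not describe.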


      FREQUENCY : always


            People

              rgiulietti Raffaello Giulietti
              webbuggrp Webbug Group