JDK-8296292

Document the default behavior of '$' in regular expressions correctly


    • b25
    • generic
    • generic

      A DESCRIPTION OF THE PROBLEM :
      The Javadoc for Pattern does not document that $ (when not in MULTILINE mode) matches not only at the very end of the input sequence, but also right before a final line terminator in the input sequence.

      Currently, the Javadocs say "By default, the regular expressions ^ and $ ignore line terminators and only match at the beginning and the end, respectively, of the entire input sequence."

      But the following code produces two matches, due to behavior apparently inherited from Perl:
      ---
      var m = Pattern.compile("\n$").matcher("\n\n");
      while (m.find()) {
          System.out.println(m.start() + ":" + m.end());
      }
      ---

      Internally, this behavior is documented in the comment on the nested class "Pattern.Dollar":
      "When not in multiline mode, the $ can only match at the very end
      of the input, unless the input ends in a line terminator in which
      it matches right before the last line terminator."

      This should be reflected in the public Javadoc as well.
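
The difference can be seen by comparing $ with the \Z and \z boundary matchers on the same input. The following is a minimal sketch, not part of the original report: the class name DollarAnchorDemo and the helper spans are hypothetical, and the default flags (no MULTILINE, no UNIX_LINES) are assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DollarAnchorDemo {

    // Hypothetical helper: returns "start:end" for every match of regex in input.
    public static List<String> spans(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + ":" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "\n\n";
        // $ also matches right before the final line terminator.
        System.out.println("$  : " + spans("\n$", input));   // [0:1, 1:2]
        // \Z is specified as "end of input but for the final terminator, if any".
        System.out.println("\\Z : " + spans("\n\\Z", input)); // [0:1, 1:2]
        // \z matches only at the very end of the input.
        System.out.println("\\z : " + spans("\n\\z", input)); // [1:2]
    }
}
```

In this sketch $ behaves like \Z rather than like \z, which is the behavior the public Javadoc currently attributes to $.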


      FREQUENCY : always


            rgiulietti Raffaello Giulietti
            webbuggrp Webbug Group