JDK-8178116

(scanner) scanner.findWithinHorizon doesn't advance after matching zero characters


    • Type: Bug
    • Resolution: Unresolved
    • Priority: P4
    • tbd
    • 6, 7, 8
    • core-libs
    • None

      Matcher.find() advances its cursor after a zero-length match. That way, even if a regex repeatedly matches zero characters, progress is still made through the input, and any enclosing loop will eventually terminate.

      Scanner.findWithinHorizon doesn't have this behavior. A simple loop over a regex that matches zero characters will return the same match every time, without progressing through the input, resulting in an infinite loop.

      Examples:

          static void matcherFind() {
              Pattern p = Pattern.compile("a*");
              Matcher m = p.matcher("abaab");
              while (m.find()) {
                  System.out.print("<" + m.group() + "> ");
              }
              System.out.println();
          }

      This results in the following output:

          <a> <> <aa> <> <>

      after which the method returns.

          static void scannerFind() {
              Scanner sc = new Scanner("abaab");
              Pattern p = Pattern.compile("a*");
              String s;
              while ((s = sc.findWithinHorizon(p, 0)) != null) {
                  System.out.print("<" + s + "> ");
              }
              System.out.println();
          }

      This results in the following output:

          <a> <> <> <> <> <> <> ...

      which never terminates.

      Since Scanner.findAll() is based on findWithinHorizon (really findPatternInBuffer), it will produce an infinite stream of empty matches if the regex matches zero characters.

      The workaround is to specify a regex that always matches at least one character. For the above example, using "a+" instead of "a*" will not report any empty matches, but it will probably be sufficient for the above examples.
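
      A minimal sketch of the workaround described above, assuming the same "abaab" input as the earlier examples. The class name ScannerWorkaround and the helper collectMatches are hypothetical names introduced only for illustration; the loop itself is the same findWithinHorizon loop as in scannerFind(), with the pattern changed to "a+".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class ScannerWorkaround {
    // Hypothetical helper (not part of the JDK): collects successive
    // findWithinHorizon matches until the scanner reports no further match.
    // It only terminates reliably if the regex cannot match zero characters.
    static List<String> collectMatches(String input, String regex) {
        Pattern p = Pattern.compile(regex);
        List<String> matches = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            String s;
            // A horizon of 0 means the search is not bounded.
            while ((s = sc.findWithinHorizon(p, 0)) != null) {
                matches.add(s);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // "a+" always consumes at least one character, so the loop ends.
        System.out.println(collectMatches("abaab", "a+")); // prints [a, aa]
    }
}
```

      Unlike the Matcher.find() loop, this reports only the non-empty matches, so it is suitable only when the empty matches are not needed.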

            smarks Stuart Marks