JDK-8038185

Regex possessive quantifier on match group causes group to keep failed matches


    • Type: Bug
    • Resolution: Duplicate
    • Priority: P4
    • None
    • 8
    • core-libs

      FULL PRODUCT VERSION :
      java version "1.8.0"
      Java(TM) SE Runtime Environment (build 1.8.0-b132)
      Java HotSpot(TM) 64-Bit Server VM (build 25.0-b70, mixed mode)

      ADDITIONAL OS VERSION INFORMATION :
      Mac OS X Mavericks 10.9.2

      A DESCRIPTION OF THE PROBLEM :
      Consider the following code:

      public static void main(final String[] args) throws Exception {
          final Pattern patt = Pattern.compile("(a)?b");
          final String test = "azzzzb";
          final Matcher matcher = patt.matcher(test);
          while(matcher.find()) {
              System.out.println(matcher.group());
              System.out.println(matcher.group(1));
          }
      }

      Running the code produces the expected output:

      b
      null

      The engine finds the first "a" and attempts to match "b". It cannot, so it proceeds along the string. When it reaches the "b", the pattern matches because the "a" is optional. The whole match is "b" and the first match group is unset (null).
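The scan described above can be made visible by trying the pattern at each start index in turn. This is only a sketch: `ScanTrace` and `firstMatchIndex` are hypothetical names introduced for illustration, and `region()` plus `lookingAt()` are used to approximate what `find()` does internally rather than to reproduce it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScanTrace {
    // Hypothetical helper: returns the first index at which the pattern
    // matches when anchored to that index, or -1 if no index works.
    static int firstMatchIndex(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        for (int i = 0; i <= input.length(); i++) {
            // region() resets the matcher; lookingAt() anchors the
            // attempt at the start of the region.
            m.region(i, input.length());
            if (m.lookingAt()) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Greedy version of the pattern from the report.
        System.out.println(firstMatchIndex("(a)?b", "azzzzb"));
    }
}
```

Each index is tried independently here, so nothing captured during a failed attempt can carry over to a later one.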

      Changing the code to this:

      public static void main(final String[] args) throws Exception {
          final Pattern patt = Pattern.compile("(a)?+b");
          final String test = "azzzzb";
          final Matcher matcher = patt.matcher(test);
          while(matcher.find()) {
              System.out.println(matcher.group());
              System.out.println(matcher.group(1));
          }
      }

      Results in the output:

      b
      a

      This is incorrect.

      Now the engine matches the "a" at the start of the string and stores it in match group 1. When the rest of the match fails, this group is not cleared. When the match on only "b" succeeds later in the string, group 1 takes no part in it, so the value captured during the earlier failed attempt is reported instead.
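If that explanation is right, a stale capture should carry the offsets of the failed attempt, which lie before the start of the successful match. A minimal sketch of a defensive accessor built on that assumption follows. `StaleGroupCheck` and `groupInsideMatch` are hypothetical names, not JDK API. The check would also discard groups captured inside a lookbehind, which can legitimately start before the match, so it is not a general-purpose fix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StaleGroupCheck {
    // Hypothetical helper: returns the group's text only if its span lies
    // inside the span of the overall match, otherwise null.
    static String groupInsideMatch(Matcher m, int group) {
        int start = m.start(group); // -1 if the group did not participate
        if (start < m.start() || m.end(group) > m.end()) {
            return null;
        }
        return m.group(group);
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("(a)?+b").matcher("azzzzb");
        while (m.find()) {
            System.out.println("match starts at " + m.start()
                    + ", group 1 starts at " + m.start(1));
            System.out.println(groupInsideMatch(m, 1));
        }
    }
}
```

On a runtime without the problem, `m.start(1)` is -1 and the helper returns null for the same reason `group(1)` does.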

      If I change the code again to:

      public static void main(final String[] args) throws Exception {
          final Pattern patt = Pattern.compile("([ac])?+b");
          final String test = "azzzzcb";
          final Matcher matcher = patt.matcher(test);
          while(matcher.find()) {
              System.out.println(matcher.group());
              System.out.println(matcher.group(1));
          }
      }

      The output is:

      cb
      c

      This is once again correct. It seems the issue occurs when an optional, possessive group is matched but the rest of the pattern fails, and then later the optional group is empty in a successful match.
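For contrast, here is a sketch of two inputs that should not meet that condition, assuming the characterization above is accurate. `TriggerCondition` and `firstGroup` are hypothetical names used only for this illustration. In "zzzzb" the group never matches during any failed attempt, and in "azzzzcb" the group takes part in the successful match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TriggerCondition {
    // Hypothetical helper: group 1 of the first match found in the input.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            throw new IllegalStateException("no match in: " + input);
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        // No "a" anywhere, so there is nothing for group 1 to keep.
        System.out.println(firstGroup("(a)?+b", "zzzzb"));
        // Group 1 participates in the successful match.
        System.out.println(firstGroup("([ac])?+b", "azzzzcb"));
    }
}
```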

      This bug seems to have been previously reported here:

      http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8027747

      But there has been no comment and the issue is still present in JDK 8.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Run the following code:

      public static void main(final String[] args) throws Exception {
          final Pattern patt = Pattern.compile("(a)?+b");
          final String test = "azzzzb";
          final Matcher matcher = patt.matcher(test);
          while(matcher.find()) {
              System.out.println(matcher.group());
              System.out.println(matcher.group(1));
          }
      }

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      The expected result is that the pattern matches "b", that the whole match group contains only "b", and that the first match group is empty or null.
      ACTUAL -
      The actual result is that the pattern matches "b" and the whole match group contains only "b", but the first match group contains "a".

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      import java.util.regex.Matcher;
      import java.util.regex.Pattern;

      public class Test {
          // Run with assertions enabled (java -ea Test).
          public static void main(final String[] args) throws Exception {
              final Pattern patt = Pattern.compile("(a)?+b");
              final String test = "azzzzb";
              final Matcher matcher = patt.matcher(test);
              while(matcher.find()) {
                  assert matcher.group().equals("b");
                  assert matcher.group(1) == null;
              }
          }
      }
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Do not use the possessive quantifier on optional match groups.
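A minimal sketch of the workaround applied to the pattern from this report follows. `GreedyWorkaround` and `firstGroup` are hypothetical names. Dropping the `+` changes backtracking behaviour in general. It is assumed to be safe here only because the optional group is the single literal "a" followed by the literal "b", so giving the "a" back can never produce a different match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyWorkaround {
    // Greedy form of the reported pattern: "(a)?b" instead of "(a)?+b".
    private static final Pattern GREEDY = Pattern.compile("(a)?b");

    // Hypothetical helper: group 1 of the first match, or null if the
    // group did not participate.
    static String firstGroup(String input) {
        Matcher m = GREEDY.matcher(input);
        if (!m.find()) {
            throw new IllegalStateException("no match in: " + input);
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(firstGroup("azzzzb")); // group 1 did not participate
        System.out.println(firstGroup("zzab"));   // group 1 captured "a"
    }
}
```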

            Assignee: sherman Xueming Shen
            Reporter: webbuggrp Webbug Group