JDK-8169235

Java REGEX match error


      FULL PRODUCT VERSION :
      java version "1.7.0_85"
      Java(TM) SE Runtime Environment (build 1.7.0_85-b15)


      ADDITIONAL OS VERSION INFORMATION :
      Linux 2.6.39-400.211.1.el6uek.x86_64 #1 SMP Fri Nov 15
      13:39:16 PST 2013 x86_64 x86_64 x86_64 GNU/Linux


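      A DESCRIPTION OF THE PROBLEM :
      On Java 7 (1.7.0_85), repeated calls to Matcher.find() miss the last
      match in the test input: the test program finds only three of the four
      function calls in the formula. Java 8 finds all four, so the problem
      seems to be fixed there.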
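
      For orientation, the pattern that the test program assembles by
      concatenation has three capturing groups: group 1 is the function name,
      group 2 is the optional "as." prefix, and group 3 is the argument between
      the parentheses. The following is a minimal sketch of that numbering.
      The class GroupNumbering and the helper groupsOfFirstMatch are
      hypothetical names used for illustration and are not part of the report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of the original report.
public class GroupNumbering {

  // The same pattern the test program builds piece by piece, written out whole.
  static final Pattern AS_FACTOR =
      Pattern.compile("\\b((as\\.)?factor|F)\\b\\s*\\(([^)]*)\\)");

  // Returns groups 1..3 of the first match, or null when nothing matches.
  static String[] groupsOfFirstMatch(String input) {
    Matcher m = AS_FACTOR.matcher(input);
    if (!m.find()) {
      return null;
    }
    return new String[] {m.group(1), m.group(2), m.group(3)};
  }

  public static void main(String[] args) {
    String[] g = groupsOfFirstMatch("as.factor( two )");
    System.out.println("group 1 (function name):   " + g[0]);
    System.out.println("group 2 (optional prefix): " + g[1]);
    System.out.println("group 3 (argument):        '" + g[2] + "'");
  }
}
```

      Group 2 is null when the prefix is absent, for example for F(one), which
      is why the test program reads the argument from group 3 rather than
      group 2.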



      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Compile and run the test program with Java 7 and with Java 8.

      Java 8 produces:

      OK I found F() with the argument 'one'
      OK I found as.factor() with the argument 'two'
      OK I found factor() with the argument 'three'
      OK I found F() with the argument 'four'

      Java 7 produces:

      OK I found F() with the argument 'one'
      OK I found as.factor() with the argument 'two'
      OK I found factor() with the argument 'three'



      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      All four function calls are matched, as on Java 8:

      OK I found F() with the argument 'one'
      OK I found as.factor() with the argument 'two'
      OK I found factor() with the argument 'three'
      OK I found F() with the argument 'four'

      ACTUAL -
      Java 7 matches only the first three; the final F(four) is not reported:

      OK I found F() with the argument 'one'
      OK I found as.factor() with the argument 'two'
      OK I found factor() with the argument 'three'

      REPRODUCIBILITY :
      This bug can always be reproduced.

      ---------- BEGIN SOURCE ----------
      import java.util.regex.Matcher;
      import java.util.regex.Pattern;

      import static java.lang.System.out;

      /**
       * @author Dmitry Golovashkin. Created on 10/28/16.
       * dmitry.golovashkin@oracle.com
       */
      public class Main {

        private static void asFactor(final String formula) {
          // Group 1: function name; group 2: optional "as." prefix.
          final String functionName = "\\b((as\\.)?factor|F)\\b"; // factor(ID), as.factor(ID), or F(ID).
          final String space = "\\s*";
          final String leftParenthesis = "\\(";
          final String id = "([^)]*)"; // Group 3: the argument. May include leading/trailing space.
          final String rightParenthesis = "\\)";
          final String asFactorRegex = functionName + space + leftParenthesis + id + rightParenthesis;

          final Pattern asFactorPattern = Pattern.compile(asFactorRegex);
          final Matcher matcher = asFactorPattern.matcher(formula);

          // Report every function call found in the formula.
          while (matcher.find()) {
            final String factorName = matcher.group(3).trim();
            out.printf("OK I found %10s() with the argument '%s'%n", matcher.group(1), factorName);
          }
        }

        public static void main(String[] args) {
          final String formula = "y ~ F(one) + as.factor(two) + factor(three) + F(four)";
          out.println("input formula " + formula);
          out.println();

          asFactor(formula);
        }
      }

      ---------- END SOURCE ----------
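
      To make the difference between releases easier to check than by reading
      the printed lines, the matches can be collected and counted. The
      following is a minimal sketch under that assumption. The class
      FactorMatchCheck and the helper findCalls are hypothetical names and are
      not part of the original test case; the pattern and the input formula are
      taken from it unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical self-checking variant of the test case; not from the report.
public class FactorMatchCheck {

  // Pattern from the test case: function name, optional spaces, "(argument)".
  static final Pattern AS_FACTOR =
      Pattern.compile("\\b((as\\.)?factor|F)\\b\\s*\\(([^)]*)\\)");

  // Collects every match as "name(argument)", with the argument trimmed.
  static List<String> findCalls(String formula) {
    List<String> calls = new ArrayList<>();
    Matcher m = AS_FACTOR.matcher(formula);
    while (m.find()) {
      calls.add(m.group(1) + "(" + m.group(3).trim() + ")");
    }
    return calls;
  }

  public static void main(String[] args) {
    String formula = "y ~ F(one) + as.factor(two) + factor(three) + F(four)";
    List<String> calls = findCalls(formula);
    System.out.println(calls);
    // The formula contains four calls, so anything else indicates the problem.
    if (calls.size() != 4) {
      throw new AssertionError("expected 4 matches, found " + calls.size());
    }
  }
}
```

      On a release that shows the behavior reported here, the check would be
      expected to fail with three matches; on a release where all four calls
      are found, it completes silently after printing the list.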

            igerasim Ivan Gerasimov
            webbuggrp Webbug Group