Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-6271399

Pattern matching using regex takes a long time

XMLWordPrintable

    • b65
    • x86
    • windows_xp
    • Verified

        FULL PRODUCT VERSION :
        java version "1.4.2_08"
        Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_08-b03)
        Java HotSpot(TM) Client VM (build 1.4.2_08-b03, mixed mode)

        ADDITIONAL OS VERSION INFORMATION :
        Microsoft Windows XP [Version 5.1.2600]

        A DESCRIPTION OF THE PROBLEM :
        In trying to match a regular expression such as, "(a|e|i|o|u)?(f|v|ff|vv)(a|e|i|o|u)?(r|rr)(a|e|i|o|u)?(s|ss)(a|e|i|o|u)?" to any given string such as "mare" produces instant results. However, trying to match a regular expression that is lengthier than the above example takes three minutes or longer.

        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        Compile and execute the standalone code included.

        EXPECTED VERSUS ACTUAL BEHAVIOR :
        EXPECTED -
        The output should be:
        false
        RE match completed in 0.016 seconds.
        true
        RE match completed in 0.016 seconds.
        ACTUAL -
        false
        RE match completed in 0.016 seconds.
        true
        RE match completed in 160.922 seconds.

        REPRODUCIBILITY :
        This bug can be reproduced always.

        ---------- BEGIN SOURCE ----------
        import java.io.*;
        import java.util.*;
        import java.util.regex.*;


        public class Test
        {
           public static void main( String argv[] )
           {
              long startTime = System.currentTimeMillis();

              String s1 = "(a|e|i|o|u)?(f|v|ff|vv)(a|e|i|o|u)?(r|rr)(a|e|i|o|u)?(s|ss)(a|e|i|o|u)?";
              Pattern p = Pattern.compile( s1 );
              
              String mare = new String( "mare" );
              Matcher m = p.matcher( mare );
              
              boolean b = m.matches();
              System.out.println( b );

              long finishTime = System.currentTimeMillis();
              System.out.println( "RE match completed in " + (double)(finishTime - startTime)/1000 + " seconds.");
              
              
              startTime = System.currentTimeMillis();
              String s2 = "(a|e|i|o|u)?(f|v|ff|vv)(a|e|i|o|u)?(r|rr)(a|aa|a|null)?(n|nn)(a|e|i|o|u)?(k|kk)(a|e|i|o|u)?(f|v|ff|vv)(a|e|i|o|u)?(r|rr)(a|e|i|o|u)?(t|tt)(a|e|i|o|u)?";
              p = Pattern.compile( s2 );
              String frankfurt = new String( "frankfurt" );
              m = p.matcher( str );
              b = m.matches();
              System.out.println( b );

              finishTime = System.currentTimeMillis();
              System.out.println( "RE match completed in " + (double)(finishTime - startTime)/1000 + " seconds.");
           }
        }
        ---------- END SOURCE ----------

        CUSTOMER SUBMITTED WORKAROUND :
        I am in the process of writing my own regular expression pattern parser and matcher. If I didn't have to use Java, Perl and Python are able to accomplish this same task in seconds.
        ###@###.### 2005-05-17 04:52:07 GMT

              sherman Xueming Shen
              jssunw Jitender S (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

                Created:
                Updated:
                Resolved:
                Imported:
                Indexed: