JDK-4497678

java.util.regex: single '&' in character class doesn't parse right


    • Type: Bug
    • Resolution: Fixed
    • Priority: P4
    • 1.4.0
    • 1.4.0
    • core-libs
    • beta3
    • generic
    • generic
    • Verified



      Name: nt126004 Date: 08/29/2001


      java version "1.4.0-beta2"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta2-b77)
      Java HotSpot(TM) Client VM (build 1.4.0-beta2-b77, mixed mode)


      A double ampersand in a character class is now treated as the
      intersection operator, but a single ampersand should be taken
      literally. The source code shows that this is the intent, but
      there is a bug. In the first run of the test program below, the
      regex "[&@]+" is tried. It should match the whole target string,
      "@@@@&&&&", but it matches only "@@@@", as if the '&' were being
      ignored. The next run, using the regex "[@&]+", shows what is
      really happening: the character following the '&' is processed
      in its place, so the ']' is treated as a literal character and
      the class never ends. The third run shows that a literal '&' can
      still be included in a character class by escaping it, but that
      shouldn't be necessary.
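
For reference, a minimal sketch of the intended semantics: "&&" acts as the intersection operator, while a lone '&' is just another member of the class. The class name AmpersandSemantics and its two helper methods are made up for this illustration and are not part of the JDK; the last check is only expected to succeed on a build where this bug is fixed.

```java
import java.util.regex.Pattern;

public class AmpersandSemantics {
    // "&&" inside a character class is the intersection operator:
    // here, lowercase letters that are not vowels.
    static boolean isConsonants(String s) {
        return Pattern.matches("[a-z&&[^aeiou]]+", s);
    }

    // A single '&' should be an ordinary literal member of the class.
    static boolean isAmpersandsOrAts(String s) {
        return Pattern.matches("[&@]+", s);
    }

    public static void main(String[] args) {
        System.out.println(isConsonants("xyz")); // true
        System.out.println(isConsonants("xaz")); // false: 'a' is excluded by the intersection
        // Expected to be true once a single '&' is parsed literally.
        System.out.println(isAmpersandsOrAts("@@@@&&&&"));
    }
}
```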


      //====================== sample code ===========================

      import java.util.regex.*;

      public class PatternTest
      {
        public static void main(String[] argv)
        {
          Pattern p1 = Pattern.compile(argv[0]);
          Matcher m1 = p1.matcher("@@@@&&&&");
          System.out.println(m1.find() ? "found: " + m1.group()
                                       : "not found");
        }
      }

      //======================== output =============================

      $ java PatternTest '[&@]+'
      found: @@@@


      $ java PatternTest '[@&]+'
      Exception in thread "main" java.util.regex.PatternSyntaxException:
       unclosed character class around index 5
              [@&]+
                  ^
              at java.util.regex.Pattern.error(Pattern.java:1455)
              at java.util.regex.Pattern.clazz(Pattern.java:1916)
              at java.util.regex.Pattern.sequence(Pattern.java:1511)
              at java.util.regex.Pattern.expr(Pattern.java:1471)
              at java.util.regex.Pattern.compile(Pattern.java:1260)
              at java.util.regex.Pattern.<init>(Pattern.java:977)
              at java.util.regex.Pattern.compile(Pattern.java:736)
              at PatternTest.main(PatternTest.java:7)


      $ java PatternTest '[@\&]+'
      found: @@@@&&&&
      (Review ID: 130865)
      ======================================================================
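
The three runs above could also be checked in one pass with something like the following sketch. It is only an illustration: the class name SingleAmpersandCheck and the helper firstMatch are hypothetical, and the patterns and target string are the ones from this report. On a build with the fix, all three patterns would be expected to find the whole target string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class SingleAmpersandCheck {
    // Target string from the report.
    static final String TARGET = "@@@@&&&&";

    // Returns the first match in TARGET, "not found" if there is none,
    // or a short marker if the pattern does not compile.
    static String firstMatch(String regex) {
        try {
            Matcher m = Pattern.compile(regex).matcher(TARGET);
            return m.find() ? m.group() : "not found";
        } catch (PatternSyntaxException e) {
            return "syntax error: " + e.getDescription();
        }
    }

    public static void main(String[] args) {
        // The three patterns tried in the report, in the same order.
        for (String regex : new String[] { "[&@]+", "[@&]+", "[@\\&]+" }) {
            System.out.println(regex + " -> " + firstMatch(regex));
        }
    }
}
```

Catching PatternSyntaxException keeps the second pattern from aborting the run on an affected build, so all three results show up side by side.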

            mmcclosksunw Michael Mccloskey (Inactive)
            nthompsosunw Nathanael Thompson (Inactive)