JDK-8288793

java.util.regex.Pattern "[a-z&&[^a]a&&x]" behaves like "[^a]"


Details

    Description

      ADDITIONAL SYSTEM INFORMATION :
      Ubuntu 18.04; OpenJDK (Temurin) 11, 17, 18;

      A DESCRIPTION OF THE PROBLEM :
      The java.util.regex.Pattern "[a-z&&[^a]a&&x]" should match only the string "x", but it actually matches every character except "a".

      This error frequently occurs with character class patterns involving two or more intersections on the same level where one of the intersected character classes is a union of a negation (left-hand side) and a singleton character or a character range (right-hand side).



      ---------- BEGIN SOURCE ----------
      import java.util.Set;
      import java.util.TreeSet;
      import java.util.regex.Pattern;
      import java.util.stream.IntStream;

      import static java.lang.System.out;
      import static java.util.stream.Collectors.toCollection;

      public class JDK_PatternExperiments
      {
        public static void main( String... args )
        {
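          var p = Pattern.compile("[a-z&&[^a]a&&x]");

          // Collect every single-letter string from "a" to "z" that the pattern matches in full
          // (asMatchPredicate() requires the whole input to match).
          var matches = IntStream
            .rangeClosed('a', 'z')
            .mapToObj(Character::toString)
            .filter(p.asMatchPredicate())
            .collect( toCollection(TreeSet::new) );

          out.printf("matches: %s\n", matches);
          // Expected result is exactly {"x"}; on the affected JDKs this check fails.
          if( ! Set.of("x").equals(matches) )
            throw new AssertionError();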
        }
      }
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Wrapping the intersected character class in an additional pair of brackets helps; for example, "[a-z&&[[^a]a]&&x]" fixes the example above.
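
The workaround can be checked with a small side-by-side comparison. The following is only a sketch and assumes Java 11 or later; the class name `WorkaroundCheck` and the helper `matchedLetters` are hypothetical names introduced for illustration and are not part of the original report. It prints the letters matched by the reported pattern and by the bracketed variant; according to the report, the bracketed variant should yield only "x".

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class WorkaroundCheck {

    // Hypothetical helper: returns the single-letter strings "a".."z"
    // that the given regex matches in full.
    static Set<String> matchedLetters(String regex) {
        Pattern p = Pattern.compile(regex);
        return IntStream.rangeClosed('a', 'z')
            .mapToObj(c -> String.valueOf((char) c))
            .filter(p.asMatchPredicate())
            .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String... args) {
        // Pattern from the report; the result depends on the JDK in use.
        System.out.println("reported:   " + matchedLetters("[a-z&&[^a]a&&x]"));

        // Workaround from the report: the union "[^a]a" gets its own brackets.
        System.out.println("workaround: " + matchedLetters("[a-z&&[[^a]a]&&x]"));
    }
}
```

Keeping the comparison in one helper makes it easy to try further patterns of the same shape, such as ones using a character range instead of a single character on the right-hand side of the union.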

          People

            igraves Ian Graves
            webbuggrp Webbug Group