Type: Bug
Resolution: Not an Issue
Priority: P4
Affected versions: 11, 17, 18, 19
OS: generic
CPU: generic
ADDITIONAL SYSTEM INFORMATION :
Ubuntu 18.04; OpenJDK (Temurin) 11, 17, 18;
A DESCRIPTION OF THE PROBLEM :
The java.util.regex.Pattern "[a-z&&[^a]a&&x]" should match only the String "x", but it actually matches everything but "a".
This error frequently occurs with character class patterns involving two or more intersections on the same level where one of the intersected character classes is a union of a negation (left-hand side) and a singleton character or a character range (right-hand side).
---------- BEGIN SOURCE ----------
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;
import java.util.stream.IntStream;

import static java.lang.System.out;
import static java.util.stream.Collectors.toCollection;

public class JDK_PatternExperiments
{
  public static void main( String... args )
  {
    var p = Pattern.compile("[a-z&&[^a]a&&x]");

    // Collect every lowercase letter that the pattern matches as a whole string.
    var matches = IntStream
      .rangeClosed('a', 'z')
      .mapToObj(Character::toString)
      .filter(p.asMatchPredicate())
      .collect( toCollection(TreeSet::new) );

    out.printf("matches: %s\n", matches);

    // Expected by the submitter: only "x" matches.
    if( ! Set.of("x").equals(matches) )
      throw new AssertionError();
  }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Adding brackets around the intersected character classes helps, e.g. "[a-z&&[[^a]a]&&x]" fixes the example above.
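To compare the two patterns side by side, the test case above can be reduced to a small helper. This is only a sketch: the class name WorkaroundSketch and the method matchingLetters are hypothetical names introduced here, not part of the original report. According to the report, the first pattern matches everything but "a" and the bracketed one matches only "x". The output for the first pattern may depend on the JDK version in use.

```java
import java.util.TreeSet;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class WorkaroundSketch
{
  // Hypothetical helper: returns every lowercase ASCII letter that the
  // given pattern matches as a whole string.
  static TreeSet<String> matchingLetters( String regex )
  {
    var p = Pattern.compile(regex);
    return IntStream
      .rangeClosed('a', 'z')
      .mapToObj(Character::toString)
      .filter(p.asMatchPredicate())
      .collect( Collectors.toCollection(TreeSet::new) );
  }

  public static void main( String... args )
  {
    // Pattern from the report: the union "[^a]a" is not bracketed.
    System.out.printf("reported:   %s%n", matchingLetters("[a-z&&[^a]a&&x]"));

    // Workaround from the report: the union is wrapped in its own brackets.
    System.out.printf("workaround: %s%n", matchingLetters("[a-z&&[[^a]a]&&x]"));
  }
}
```

Running both patterns through the same helper makes it easy to check on a given JDK whether the extra brackets change the result.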