- Bug
- Resolution: Fixed
- P4
- 11, 12
- b15
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8328950 | 11-pool | Christoph Langer | P4 | Closed | Won't Fix | |
ADDITIONAL SYSTEM INFORMATION :
$ java -version
openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)
A DESCRIPTION OF THE PROBLEM :
When using the CASE_INSENSITIVE flag, the matching behavior of the POSIX character classes and a literal character class with the same set differs.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
See test program.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The pattern "[a-z]" should behave the same as "\\p{Lower}", which the docs describe as US-ASCII only and equivalent to "[a-z]".
ACTUAL -
When running with the CASE_INSENSITIVE flag, "[a-z]" will match an uppercase letter, but "\\p{Lower}" will not.
---------- BEGIN SOURCE ----------
// $ javac Test.java
// $ java -ea Test
// Exception in thread "main" java.lang.AssertionError
// at Test.main(Test.java:8)
import java.util.regex.Pattern;
public class Test {
    public static void main(String[] args) {
        Pattern p1 = Pattern.compile("[a-z]", Pattern.CASE_INSENSITIVE);
        Pattern p2 = Pattern.compile("\\p{Lower}", Pattern.CASE_INSENSITIVE);
        assert(p1.matcher("A").find() == p2.matcher("A").find());
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Avoid using POSIX character classes.
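One way to apply this workaround is to spell out the literal range wherever a POSIX class would be combined with CASE_INSENSITIVE. The sketch below is illustrative only: the class name `PosixClassWorkaround` and the helper `finds` are hypothetical and not part of the original report. The result printed for "\\p{Lower}" depends on the JDK build in use, so no expected value is given for it.

```java
import java.util.regex.Pattern;

public class PosixClassWorkaround {

    // Hypothetical helper: true if the regex finds a match anywhere in the input.
    static boolean finds(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        int ci = Pattern.CASE_INSENSITIVE;

        // Literal range: the flag is honored, so the uppercase "A" matches.
        System.out.println("[a-z] with flag: " + finds("[a-z]", ci, "A"));

        // POSIX class: reported above as not matching "A" on 11.0.1;
        // the outcome may differ on other builds.
        System.out.println("\\p{Lower} with flag: " + finds("\\p{Lower}", ci, "A"));

        // Without the flag, neither form matches an uppercase letter.
        System.out.println("[a-z] without flag: " + finds("[a-z]", 0, "A"));
        System.out.println("\\p{Lower} without flag: " + finds("\\p{Lower}", 0, "A"));
    }
}
```

Replacing "\\p{Lower}" with "[a-z]" keeps the US-ASCII scope described in the report while giving the same case-insensitive result on every build.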
FREQUENCY : always
- backported by: JDK-8328950 Case insensitive matching doesn't work correctly for some character classes (Closed)
- csr for: JDK-8238984 Case insensitive matching doesn't work correctly for some character classes (Closed)
- relates to: JDK-8305733 Pattern.CASE_INSENSITIVE does not take effect in jdk11 (Closed)
- links to: Review openjdk/jdk11u-dev/2062