Type: Bug
Resolution: Not an Issue
Priority: P3
Fix Version/s: None
Affects Version/s: 19, 20, 21
Introduced In Build: b16
CPU: generic
OS: generic
A DESCRIPTION OF THE PROBLEM :
This seems to be related to the changes made for JDK-8264160. In JDK 17 and earlier, the following test would pass:
var pattern= Pattern.compile("(?:\\b|\\d)"+"äst");
assertTrue("äst".matches(pattern.pattern()));
That is, the word boundary \b would match in front of the a-umlaut character. This behaviour appears to have changed incompatibly with the fix for the bug mentioned above: on JDK 20 the test now fails.
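The boundary check described above can be isolated from the rest of the pattern. The following is a minimal sketch, not part of the original report: the class name `BoundaryDemo` and the helper `boundaryBeforeUmlaut` are hypothetical, and the a-umlaut is written as `\u00e4` to avoid source-encoding issues. The default-mode result depends on the JDK in use (the report observes a match on 17 and none on 20), so no expected output is stated for it.

```java
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Hypothetical helper: does \b match at the position just before the a-umlaut?
    // The pattern is \b followed by the literal text, so matches() succeeds only
    // if the boundary is recognised at index 0.
    static boolean boundaryBeforeUmlaut(int flags) {
        return Pattern.compile("\\b\u00e4st", flags)
                      .matcher("\u00e4st")
                      .matches();
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        // Version-dependent according to the report, so only printed here.
        System.out.println("default mode: " + boundaryBeforeUmlaut(0));
        // With the flag, \w and \b follow the Unicode definition of a word character.
        System.out.println("UNICODE_CHARACTER_CLASS: "
                + boundaryBeforeUmlaut(Pattern.UNICODE_CHARACTER_CLASS));
    }
}
```

Running this on two JDKs side by side should show whether the difference lies in \b itself rather than in the alternation with \d.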
However, as specified in the documentation, UNICODE_CHARACTER_CLASS should make the predefined character classes follow the Unicode definitions, so I would expect this code to pass:
var pattern= Pattern.compile("(?:\\b|\\d)"+"äst", Pattern.UNICODE_CHARACTER_CLASS);
assertTrue("äst".matches(pattern.pattern()));
But it fails on JDK 20 (and JDK 21 EA).
The documentation also states: "The UNICODE_CHARACTER_CLASS mode can also be enabled via the embedded flag expression (?U)."
var pattern2= Pattern.compile("(?U)(?:\\b|\\d)"+"äst");
assertTrue("äst".matches(pattern2.pattern()));
When the embedded flag expression is used, the test passes.
REGRESSION : Last worked in version 17.0.8
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the code below.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The code runs without throwing an exception.
ACTUAL -
The code throws an IllegalStateException.
---------- BEGIN SOURCE ----------
var pattern = Pattern.compile("(?:\\b|\\d)" + "äst", Pattern.UNICODE_CHARACTER_CLASS);
if (!"äst".matches(pattern.pattern()))
    throw new IllegalStateException("äst should match pattern1");
---------- END SOURCE ----------
FREQUENCY : always
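One possible reading of these results, offered as an assumption rather than as the maintainers' stated rationale: Pattern.pattern() returns only the regex source text, and String.matches(String) compiles that text afresh with no flags, so a flag passed to Pattern.compile is not in effect for the match, while an embedded (?U) travels with the string. The sketch below contrasts the two call styles. The class name `FlagDropDemo` and its method names are hypothetical, and the a-umlaut is written as `\u00e4`.

```java
import java.util.regex.Pattern;

public class FlagDropDemo {
    static final String REGEX = "(?:\\b|\\d)\u00e4st";
    static final String INPUT = "\u00e4st";

    // Matches through the compiled Pattern, so UNICODE_CHARACTER_CLASS stays in effect.
    static boolean matchWithCompiledPattern() {
        Pattern p = Pattern.compile(REGEX, Pattern.UNICODE_CHARACTER_CLASS);
        return p.matcher(INPUT).matches();
    }

    // Mirrors the reproducer: only the regex text reaches String.matches,
    // which compiles it again without any flags.
    static boolean matchViaPatternString() {
        Pattern p = Pattern.compile(REGEX, Pattern.UNICODE_CHARACTER_CLASS);
        return INPUT.matches(p.pattern());
    }

    // pattern() hands back the original source text; the flag is not encoded in it.
    static boolean patternTextCarriesNoFlag() {
        Pattern p = Pattern.compile(REGEX, Pattern.UNICODE_CHARACTER_CLASS);
        return p.pattern().equals(REGEX);
    }

    public static void main(String[] args) {
        System.out.println("compiled pattern with flag: " + matchWithCompiledPattern());
        // Version-dependent according to the report, so only printed here.
        System.out.println("String.matches(pattern.pattern()): " + matchViaPatternString());
        System.out.println("pattern() equals source text: " + patternTextCarriesNoFlag());
    }
}
```

If this reading is right, calling pattern.matcher(input).matches() instead of input.matches(pattern.pattern()) keeps the flag in effect without needing the embedded (?U).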