Bug
Resolution: Fixed
P3
20, 21
b18
b26
generic
generic
Verified
ADDITIONAL SYSTEM INFORMATION :
OS independent; bug can be seen in java.util.regex.Matcher source code, introduced in commit openjdk/jdk@ce85cac for issue 8065554.
A DESCRIPTION OF THE PROBLEM :
In addressing issue JDK-8065554 (MatchResult should provide values of named-capturing groups), commit openjdk/jdk@ce85cac added a namedGroups field in Matcher to cache the map from parentPattern.namedGroups().
The map is cached lazily: it is populated only when the field is null and namedGroups() is called, which may happen indirectly through a call to start(String), end(String), or group(String). The same cached value then continues to be returned even if Matcher.usePattern is later called and the new Pattern has different named groups, no named groups, or the same named groups mapped to different integers. Symptoms can therefore include wrong results when retrieving by named group, spurious IllegalArgumentExceptions for groups that the new pattern provides, missing exceptions for groups that the new pattern doesn't provide, or exceptions for an invalid group index when calling a method that takes a group name.
This could be fixed by eliminating the local copy and having Matcher.namedGroups() call parentPattern.namedGroups() unconditionally, or by having Matcher.usePattern null the field, so that the correct map is lazily cached when next needed.
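To illustrate the second suggested fix, here is a minimal toy model of the caching behaviour. It is not the JDK source: ToyPattern, ToyMatcher, and the resetCacheOnUsePattern flag are hypothetical names introduced only for this sketch, which models nothing but the lazily cached name-to-index map.

```java
import java.util.Map;

public class StaleCacheSketch {
    /** Hypothetical stand-in for Pattern: only holds a group-name-to-index map. */
    static final class ToyPattern {
        private final Map<String, Integer> namedGroups;

        ToyPattern(Map<String, Integer> namedGroups) {
            this.namedGroups = namedGroups;
        }

        Map<String, Integer> namedGroups() {
            return namedGroups;
        }
    }

    /** Hypothetical stand-in for Matcher: only models the cached map. */
    static final class ToyMatcher {
        private ToyPattern parentPattern;
        private Map<String, Integer> namedGroups; // lazily cached copy
        private final boolean resetCacheOnUsePattern; // assumption: toggles the fix

        ToyMatcher(ToyPattern pattern, boolean resetCacheOnUsePattern) {
            this.parentPattern = pattern;
            this.resetCacheOnUsePattern = resetCacheOnUsePattern;
        }

        Map<String, Integer> namedGroups() {
            if (namedGroups == null) {
                // Populated only on first use, as described in the report.
                namedGroups = parentPattern.namedGroups();
            }
            return namedGroups;
        }

        ToyMatcher usePattern(ToyPattern newPattern) {
            parentPattern = newPattern;
            if (resetCacheOnUsePattern) {
                // The suggested fix: drop the cache so it is rebuilt on demand.
                namedGroups = null;
            }
            return this;
        }
    }

    public static void main(String[] args) {
        ToyPattern p1 = new ToyPattern(Map.of("a", 1, "b", 2));
        ToyPattern p2 = new ToyPattern(Map.of("b", 1, "a", 2));

        ToyMatcher buggy = new ToyMatcher(p1, false);
        buggy.namedGroups(); // caches p1's map
        buggy.usePattern(p2);
        System.out.println("without reset: a -> " + buggy.namedGroups().get("a"));

        ToyMatcher fixed = new ToyMatcher(p1, true);
        fixed.namedGroups(); // caches p1's map
        fixed.usePattern(p2);
        System.out.println("with reset:    a -> " + fixed.namedGroups().get("a"));
    }
}
```

Without the reset, the toy matcher keeps returning p1's index for "a" after switching to p2; with the reset, the map is re-read from the new pattern on the next call.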
REGRESSION : Last worked in version 17
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
jshell in version 20:
var p1 = Pattern.compile("(?<a>...)(?<b>...)");
var p2 = Pattern.compile("(?<b>...)(?<a>...)");
var m = p1.matcher("foobar");
m.matches()
// ==> true
m.group("a")
// ==> "foo"
m.usePattern(p2)
m.matches()
// ==> true
m.group("a")
// ==> "foo" WRONG RESULT: should be "bar" for p2
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Using p1, (?<a>...)(?<b>...), on the string "foobar" should return "foo" for group a.
Using p2, (?<b>...)(?<a>...), on the string "foobar" should return "bar" for group a.
ACTUAL -
After usePattern(p2) on the string "foobar", "foo" is incorrectly returned for group a, because the group name to group index mapping for pattern p1 is still cached.
CUSTOMER SUBMITTED WORKAROUND :
Calls to m.namedGroups() on a Matcher m can be replaced with calls to m.pattern().namedGroups(), and any calls to m.start(String), m.end(String), or m.group(String) can be replaced with code that maps the name string to an integer index using m.pattern().namedGroups(), throwing the proper exception if not found, and then calls start(int), end(int), or group(int).
In code that must also compile on Java versions before 20, where the namedGroups() method is unknown, a MethodHandle can be constructed for it at runtime, or variant code can be supplied in a multi-release jar.
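The first part of the workaround might be sketched as follows. This assumes Java 20 or later, where Pattern.namedGroups() is public; the class name NamedGroupWorkaround, its helper methods, and the exception message text are illustrative choices, not part of the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupWorkaround {
    /** Resolves a group name against the matcher's current pattern, bypassing the matcher's cache. */
    static int groupIndex(Matcher m, String name) {
        Integer index = m.pattern().namedGroups().get(name);
        if (index == null) {
            // Message text is an assumption; only the exception type matters here.
            throw new IllegalArgumentException("No group with name <" + name + ">");
        }
        return index;
    }

    static String group(Matcher m, String name) {
        return m.group(groupIndex(m, name));
    }

    static int start(Matcher m, String name) {
        return m.start(groupIndex(m, name));
    }

    static int end(Matcher m, String name) {
        return m.end(groupIndex(m, name));
    }

    public static void main(String[] args) {
        Pattern p1 = Pattern.compile("(?<a>...)(?<b>...)");
        Pattern p2 = Pattern.compile("(?<b>...)(?<a>...)");
        Matcher m = p1.matcher("foobar");

        m.matches();
        System.out.println(group(m, "a")); // group a under p1

        m.usePattern(p2);
        m.matches();
        System.out.println(group(m, "a")); // group a under p2, resolved via p2's own map
    }
}
```

Because the name is looked up through m.pattern() on every call, the result follows whichever pattern the matcher currently uses, regardless of what the matcher itself has cached.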
Relates to: JDK-8065554 MatchResult should provide values of named-capturing groups (Resolved)