Type: Bug
Resolution: Not an Issue
Priority: P4
Fix Version: None
Affected Version: 8u66, 9
OS: generic
CPU: generic
FULL PRODUCT VERSION :
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) Client VM (build 25.65-b01, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
Windows 10
A DESCRIPTION OF THE PROBLEM :
There is a bug in java.util.regex that affects patterns of the form "\\b" + string + "\\b" when the inner string contains the special character "$".
Example: given the search string "$Eclipse" and a text that contains the search string as a word, such as "available under the terms of the $Eclipse Public License",
the Pattern "\b\Q$Eclipse\E\b" doesn't match.
see also: https://bugs.eclipse.org/bugs/show_bug.cgi?id=487392
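For context on the "Not an Issue" resolution: \b matches only at a position where a word character sits next to a non-word character (or the start or end of input). In the example text, "$" is preceded by a space, and neither of those is a word character, so no boundary exists in front of "$Eclipse". The sketch below illustrates this; the class name WordBoundaryDemo and the helper found are made up for the illustration and are not part of the report.

```java
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Hypothetical helper: true if the regex is found anywhere in the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "available under the terms of the $Eclipse Public License";

        // A space followed by "$": two non-word characters, so \b cannot match there.
        System.out.println(found("\\b\\Q$Eclipse\\E\\b", text)); // false

        // Moving \b to sit between "$" and "E" gives a real word boundary.
        System.out.println(found("\\Q$\\E\\bEclipse\\b", text)); // true

        // Without the "$", the boundaries around "Eclipse" behave as usual.
        System.out.println(found("\\bEclipse\\b", text)); // true
    }
}
```

The first pattern is the one from this report; the other two only move or drop the leading boundary to show where \b can match.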
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Generate a Pattern "\b\Q$Eclipse\E\b"
Execute a find on the given text: "available under the terms of the $Eclipse $Public License"
Compare with a find that uses a quoted Pattern without boundaries (the code below uses "\Q$Public\E")
public static void main(String[] args) {
    String search1 = "\\b\\Q$Eclipse\\E\\b";
    String search2 = "\\Q$Public\\E";
    String text = "available under the terms of the $Eclipse $Public License";
    int patternFlags = Pattern.CASE_INSENSITIVE;
    Pattern pattern1 = Pattern.compile(search1, patternFlags);
    Pattern pattern2 = Pattern.compile(search2, patternFlags);
    Matcher m1 = pattern1.matcher(text);
    Matcher m2 = pattern2.matcher(text);
    System.out.println(String.format("m1.find():%s%nm2.find():%s", m1.find(), m2.find()));
}
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
the Pattern should be able to find the word "$Eclipse" inside the text
m1.find():true
m2.find():true
ACTUAL -
there is no match on the search text
m1.find():false
m2.find():true
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; it only makes the snippet compilable.
public class BoundaryRepro {
    public static void main(String[] args) {
        String search1 = "\\b\\Q$Eclipse\\E\\b"; // quoted literal wrapped in word boundaries
        String search2 = "\\Q$Public\\E";        // quoted literal without boundaries
        String text = "available under the terms of the $Eclipse $Public License";
        int patternFlags = Pattern.CASE_INSENSITIVE;
        Pattern pattern1 = Pattern.compile(search1, patternFlags);
        Pattern pattern2 = Pattern.compile(search2, patternFlags);
        Matcher m1 = pattern1.matcher(text);
        Matcher m2 = pattern2.matcher(text);
        // the difference in output shows the bug
        System.out.println(String.format("m1.find():%s%nm2.find():%s", m1.find(), m2.find()));
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
I have not spent any time on a workaround.
The idea is: the dollar symbol, and possibly other special characters, do not work properly when enclosed between word boundary matchers (\b).
The solution might be a review of the relevant regex method.
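One possible workaround, offered only as a sketch and not as anything the report or the JDK team proposed, is to replace \b with lookarounds that require "no word character on either side". This does not care whether the search string itself starts or ends with a word character. The class LiteralWordSearch and the method wholeWord are hypothetical names chosen for this example; Pattern.quote is the standard way to build the \Q...\E form.

```java
import java.util.regex.Pattern;

public class LiteralWordSearch {
    // Hypothetical helper: matches `literal` only where it is not directly
    // preceded or followed by a word character (\w).
    static Pattern wholeWord(String literal) {
        String regex = "(?<!\\w)" + Pattern.quote(literal) + "(?!\\w)";
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        String text = "available under the terms of the $Eclipse $Public License";
        Pattern p = wholeWord("$Eclipse");

        // "$Eclipse" stands alone in the text, so this is found.
        System.out.println(p.matcher(text).find()); // true

        // Followed by a word character ("s"), so this is rejected.
        System.out.println(p.matcher("the $Eclipses project").find()); // false
    }
}
```

Whether "not adjacent to a word character" is the right definition of a whole-word match depends on the caller; a stricter variant could use (?<!\S) and (?!\S) to require whitespace or the ends of the input instead.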