- CSR
- Resolution: Approved
- P4
- None
- behavioral
- minimal
- The risk of memory exhaustion has always been present, although highly unlikely.
- Java API
- SE
Summary
When a java.util.regex.Pattern is compiled with CANON_EQ among the flags, there is a moderate risk of memory exhaustion.
Problem
When a Pattern is compiled with CANON_EQ among the flags, there is a moderate risk of memory exhaustion, depending on the complexity of the pattern. While the specification of this flag warns about a possible performance penalty, it does not address memory exhaustion, which may occur during compilation rather than during matching.
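To illustrate why the risk depends on the complexity of the pattern, here is a rough sketch. It assumes that the cost is driven by the permutations of combining marks named in the title of JDK-8300207, so that a run of n marks can yield up to n! orderings. The class and method names are hypothetical, and this is not the JDK's implementation.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not code from java.util.regex.Pattern.
class CanonEqGrowth {

    // Length of the longest run of consecutive non-spacing marks
    // (Unicode general category Mn) in the given text.
    static int longestMarkRun(String s) {
        int longest = 0;
        int current = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.getType(cp) == Character.NON_SPACING_MARK) {
                current++;
                longest = Math.max(longest, current);
            } else {
                current = 0;
            }
            i += Character.charCount(cp);
        }
        return longest;
    }

    // Upper bound on the number of orderings of n marks: n factorial.
    static BigInteger orderings(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // 'a' followed by three combining marks (acute, dot below, diaeresis).
        String regex = "a\u0301\u0323\u0308";
        int n = longestMarkRun(regex);
        System.out.println(n + " marks: up to " + orderings(n) + " orderings");
        System.out.println("15 marks: up to " + orderings(15) + " orderings");
    }
}
```

The factorial bound grows quickly, which is consistent with the statement that the risk is negligible for ordinary patterns and only becomes relevant when many combining marks follow a single character.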
Solution
Complete the specification by mentioning memory exhaustion as a low to moderate risk factor.
In addition, the implementation checks in advance whether the pattern is too complex and, if so, throws OutOfMemoryError directly instead of attempting an allocation that would surely fail.
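From a caller's point of view, the change means that an overly complex CANON_EQ pattern fails at compile time with OutOfMemoryError. The following is a minimal sketch of how calling code might guard against that when the regex comes from an untrusted source. The helper name tryCompile is hypothetical, and catching OutOfMemoryError is generally discouraged; the sketch assumes the error originates from the pre-check described above rather than from a truly exhausted heap.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical caller-side guard; not part of the JDK.
class SafeCanonEqCompile {

    // Compiles the regex with CANON_EQ and returns an empty Optional
    // if compilation fails with OutOfMemoryError.
    static Optional<Pattern> tryCompile(String regex) {
        try {
            return Optional.of(Pattern.compile(regex, Pattern.CANON_EQ));
        } catch (OutOfMemoryError e) {
            // Assumption: thrown because the pattern is too complex.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // 'a' + combining ring above is canonically equivalent to U+00E5.
        Optional<Pattern> pattern = tryCompile("a\u030A");
        boolean found = pattern
                .map(p -> p.matcher("\u00E5").find())
                .orElse(false);
        System.out.println("compiled and matched: " + found);
    }
}
```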
Specification
@@ -916,7 +916,8 @@ public final class Pattern
* <p> There is no embedded flag character for enabling canonical
* equivalence.
*
- * <p> Specifying this flag may impose a performance penalty. </p>
+ * <p> Specifying this flag may impose a performance penalty
+ * and a moderate risk of memory exhaustion.</p>
*/
public static final int CANON_EQ = 0x80;
@@ -1095,6 +1096,9 @@ public final class Pattern
* Compiles the given regular expression into a pattern with the given
* flags.
*
+ * <p>Setting {@link #CANON_EQ} among the flags may impose a moderate risk
+ * of memory exhaustion.</p>
+ *
* @param regex
* The expression to be compiled
*
@@ -1112,6 +1116,10 @@ public final class Pattern
*
* @throws PatternSyntaxException
* If the expression's syntax is invalid
+ *
+ * @implNote If {@link #CANON_EQ} is specified and the number of combining
+ * marks for any character is too large, an {@link java.lang.OutOfMemoryError}
+ * is thrown.
*/
public static Pattern compile(String regex, int flags) {
@@ -1145,6 +1153,13 @@ public final class Pattern
* The character sequence to be matched
*
* @return A new matcher for this pattern
+ *
+ * @implNote When a {@link Pattern} is deserialized, compilation is deferred
+ * until a direct or indirect invocation of this method. Thus, if a
+ * deserialized pattern has {@link #CANON_EQ} among its flags and the number
+ * of combining marks for any character is too large, an
+ * {@link java.lang.OutOfMemoryError} is thrown,
+ * as in {@link #compile(String, int)}.
*/
public Matcher matcher(CharSequence input) {
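The note on matcher concerns deserialized patterns, for which compilation happens lazily. The sketch below round-trips a small CANON_EQ pattern through Java serialization and then calls matcher, which is the point at which, per the note, an OutOfMemoryError could surface for an overly complex pattern. The class and method names are hypothetical; the pattern used here is deliberately small so no error is expected.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the JDK.
class DeserializedPatternDemo {

    // Serializes the pattern to a byte array and reads it back.
    static Pattern roundTrip(Pattern p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Pattern) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Pattern copy = roundTrip(Pattern.compile("a\u030A", Pattern.CANON_EQ));
        // The flags survive the round trip.
        System.out.println("CANON_EQ set: " + ((copy.flags() & Pattern.CANON_EQ) != 0));
        // Calling matcher() is where a deserialized pattern gets compiled.
        System.out.println("found: " + copy.matcher("\u00E5").find());
    }
}
```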
- csr of JDK-8300207 Add a pre-check for the number of canonical equivalent permutations in j.u.r.Pattern (Closed)