JDK-8357197

Add a pre-check for the number of canonical equivalent permutations in j.u.r.Pattern


    • Type: CSR
    • Resolution: Unresolved
    • Priority: P4
    • 11-pool, 17-pool
    • core-libs
    • None
    • behavioral
    • minimal
    • The risk of memory exhaustion has always been present, although highly unlikely.
    • Java API
    • Implementation

      Summary

      When a java.util.regex.Pattern is compiled with CANON_EQ among the flags, there is a moderate risk of memory exhaustion.

      Problem

      When a Pattern is compiled with CANON_EQ among the flags, there is a moderate risk of memory exhaustion, depending on the complexity of the pattern. While the specification of this flag warns about a possible performance penalty, it does not address memory exhaustion, which can occur during compilation rather than during matching.

      Solution

      Change the implementation to check in advance whether the pattern is too complex, and throw OutOfMemoryError directly instead of attempting an allocation that is certain to fail. In addition, @implNote text needs to be added as shown under Specification.
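
      The idea of such a pre-check can be sketched as follows. This is a minimal illustration, not the JDK's code: it assumes the cost is driven by the number of orderings of n combining marks (n!), and the class name, method names, and the MAX_PERMUTATIONS limit are all hypothetical.

```java
public final class CanonEqPreCheck {

    // Hypothetical limit, chosen only for illustration; not the JDK's value.
    static final long MAX_PERMUTATIONS = 1_000_000L;

    /** Returns marks! (the number of orderings), or -1 if it overflows a long. */
    static long permutationCount(int marks) {
        long count = 1;
        try {
            for (int i = 2; i <= marks; i++) {
                count = Math.multiplyExact(count, (long) i);
            }
        } catch (ArithmeticException overflow) {
            return -1;
        }
        return count;
    }

    /** Fails fast, before any large allocation is attempted. */
    static void check(int marks) {
        long count = permutationCount(marks);
        if (count < 0 || count > MAX_PERMUTATIONS) {
            throw new OutOfMemoryError(
                    "Pattern too complex: " + marks + " combining marks");
        }
    }

    public static void main(String[] args) {
        check(5); // 120 orderings, under the illustrative limit
        try {
            check(15); // far above the illustrative limit
        } catch (OutOfMemoryError e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      Because the count is computed arithmetically, the check costs a handful of multiplications regardless of how large the rejected allocation would have been.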

      The following two changes from the parent CSR are removed in this CSR:

      -     * Specifying this flag may impose a performance penalty. 
      +     * Specifying this flag may impose a performance penalty
      +     * and a moderate risk of memory exhaustion.
      
      
      +     * Setting {@link #CANON_EQ} among the flags may impose a moderate risk
      +     * of memory exhaustion.
      +     *

      Specification

      @@ -1112,6 +1116,10 @@ public final class Pattern
            *
            * @throws  PatternSyntaxException
            *          If the expression's syntax is invalid
      +     *
      +     * @implNote If {@link #CANON_EQ} is specified and the number of combining
      +     * marks for any character is too large, an {@link java.lang.OutOfMemoryError}
      +     * is thrown.
            */
           public static Pattern compile(String regex, int flags) {
      
      @@ -1145,6 +1153,13 @@ public final class Pattern
            *         The character sequence to be matched
            *
            * @return  A new matcher for this pattern
      +     *
      +     * @implNote When a {@link Pattern} is deserialized, compilation is deferred
      +     * until a direct or indirect invocation of this method. Thus, if a
      +     * deserialized pattern has {@link #CANON_EQ} among its flags and the number
      +     * of combining marks for any character is too large, an
      +     * {@link java.lang.OutOfMemoryError} is thrown,
      +     * as in {@link #compile(String, int)}.
            */
           public Matcher matcher(CharSequence input) {
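
      From a caller's point of view, the two notes above mean the error can surface either in compile or, for a deserialized pattern, in the first call to matcher. The sketch below illustrates both call sites with a small, harmless pattern. The class and method names are hypothetical, and catching OutOfMemoryError is shown only to mark where it could be thrown; whether an application should catch it is a separate decision.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class CanonEqCallSites {

    /** Compiles with CANON_EQ; empty if the pattern is rejected as too complex. */
    static Optional<Pattern> tryCompile(String regex) {
        try {
            return Optional.of(Pattern.compile(regex, Pattern.CANON_EQ));
        } catch (OutOfMemoryError e) {
            return Optional.empty();
        }
    }

    /** Serializes and deserializes a pattern in memory. */
    static Pattern roundTrip(Pattern p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Pattern) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Per the note on matcher, a deserialized pattern may fail here, not earlier. */
    static Optional<Matcher> tryMatcher(Pattern p, CharSequence input) {
        try {
            return Optional.of(p.matcher(input));
        } catch (OutOfMemoryError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // "a" followed by COMBINING RING ABOVE; one combining mark only.
        Pattern original = tryCompile("a\u030A").orElseThrow();
        Pattern copy = roundTrip(original);
        Matcher m = tryMatcher(copy, "\u00E5").orElseThrow();
        System.out.println(m.matches());
    }
}
```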

            jjose Johny Jose
            rgiulietti Raffaello Giulietti