JDK-8311939

Excessive allocation of Matcher.groups array

    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • Fix Version/s: 22
    • Affects Version/s: 11, 17, 21, 22
    • Component/s: core-libs

      Spotted here:
       https://twitter.com/deathy/status/1679070832801316864

      See for example here:
       https://github.com/openjdk/jdk/blob/aa7367f1ecc5da15591963e56e1435aa7b830f79/src/java.base/share/classes/java/util/regex/Matcher.java#L250

      ```
              // Allocate state storage
              int parentGroupCount = Math.max(parent.capturingGroupCount, 10);
              groups = new int[parentGroupCount * 2];
      ```

      There seems to be little point in always clamping the group count to at least 10, since we could allocate just `parent.capturingGroupCount * 2` ints.
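
      To put numbers on the difference, here is a minimal sketch that compares the two array lengths for a few patterns. The class and method names are made up for illustration, and it assumes that the internal `capturingGroupCount` equals the public `Matcher.groupCount()` plus one (for group 0):

      ```java
      import java.util.regex.Pattern;

      public class GroupsArraySize {
          // Mirrors the current allocation: at least 10 groups, two ints (start, end) per group.
          static int clampedLength(int capturingGroupCount) {
              return Math.max(capturingGroupCount, 10) * 2;
          }

          // The proposed allocation: exactly two ints per group, no lower bound.
          static int exactLength(int capturingGroupCount) {
              return capturingGroupCount * 2;
          }

          // Assumption: the internal count is the public groupCount() plus one for group 0.
          static int capturingGroupCount(String regex) {
              return Pattern.compile(regex).matcher("").groupCount() + 1;
          }

          public static void main(String[] args) {
              for (String regex : new String[] {"abc", "(a)(b)", "(abc)(def)\\3"}) {
                  int count = capturingGroupCount(regex);
                  System.out.println(regex + ": clamped=" + clampedLength(count)
                          + ", exact=" + exactLength(count));
              }
          }
      }
      ```

      Under that assumption, a pattern without explicit groups gets 20 ints today where 2 would do, and the clamp only stops mattering once a pattern has 9 or more explicit groups.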

      If we remove that clamp, then some tests would fail:

      ```
      test RegExTest.backRefTest(): failure
      java.lang.ArrayIndexOutOfBoundsException: Index 6 out of bounds for length 6
      at java.base/java.util.regex.Pattern$BackRef.match(Pattern.java:5190)
      at java.base/java.util.regex.Pattern$GroupTail.match(Pattern.java:5000)
      at java.base/java.util.regex.Pattern$Slice.match(Pattern.java:4268)
      at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4969)
      at java.base/java.util.regex.Pattern$GroupTail.match(Pattern.java:5000)
      at java.base/java.util.regex.Pattern$Slice.match(Pattern.java:4268)
      at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4969)
      at java.base/java.util.regex.Pattern$Start.match(Pattern.java:3787)
      at java.base/java.util.regex.Matcher.search(Matcher.java:1736)
      ```

      That is because Pattern.compile("abc\9") should still compile, as per Javadoc:

      "In this class, \1 through \9 are always interpreted as back references, "

      ...but the back reference would then index into Matcher.groups beyond the end of the smaller array and fail with ArrayIndexOutOfBoundsException.
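
      For reference, a small sketch of the behavior that has to be preserved, using only the public API. The class name and helper are hypothetical, and the patterns are illustrative examples of back references to groups that do not exist. The arithmetic in the comments assumes that a back reference to group n reads `groups[2 * n]` and `groups[2 * n + 1]`, which is consistent with the "Index 6 out of bounds for length 6" message above but is an inference, not something taken from the fix:

      ```java
      import java.util.regex.Matcher;
      import java.util.regex.Pattern;

      public class BackRefBeyondGroups {
          // Compiles the regex and reports whether it finds a match anywhere in the input.
          static boolean finds(String regex, String input) {
              Matcher m = Pattern.compile(regex).matcher(input);
              return m.find();
          }

          public static void main(String[] args) {
              // \9 names a group that does not exist, yet the pattern must still compile.
              // The match is expected to fail quietly rather than throw.
              System.out.println(finds("abc\\9", "abc"));

              // Group 0 plus two explicit groups: an exact allocation would hold 6 ints,
              // while \3 would (by the assumption above) read index 6.
              System.out.println(finds("(abc)(def)\\3", "abcdefabc"));

              // A back reference to an existing group keeps working as usual.
              System.out.println(finds("(a*)bc\\1", "zzzaabcaazzz"));
          }
      }
      ```

      With the clamp in place, the first two calls simply report no match, and any change to the allocation should keep it that way.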

      It remains to be seen whether the allocation clamp in Matcher can be removed without breaking the rest of the engine.

            rgiulietti Raffaello Giulietti
            shade Aleksey Shipilev