JDK-8253451

Performance regression in java.util.Scanner after 8236201


    • Bug
    • Resolution: Unresolved
    • P4
    • tbd
    • None
    • core-libs
    • generic
    • generic

      There is a performance regression in java.util.Scanner introduced by 8236201 (CPU 2020-04). JDK mainline, 11u and 8u are affected, and possibly other releases as well.

      8236201 changed the Scanner::groupSeparator and Scanner::decimalSeparator regexp string patterns from simple escaping (\...) to a more complex form (\x{...}) [1][2].
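
      To make the two escape forms concrete, here is a minimal sketch. The class and method names (SeparatorEscapes, simpleEscape, hexEscape) are hypothetical, and the exact string construction is an assumption based on the description above rather than a copy of the Scanner source. It only shows that both forms match the same single character.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; not code from java.util.Scanner.
class SeparatorEscapes {
    // Simple escaping: a backslash followed by the literal character.
    static String simpleEscape(char sep) {
        return "\\" + sep;
    }

    // Hexadecimal escaping: \x{h...h} with the character's code in hex.
    // Assumed construction; the real Scanner code may differ in detail.
    static String hexEscape(char sep) {
        return "\\x{" + Integer.toHexString(sep) + "}";
    }

    public static void main(String[] args) {
        char sep = ',';
        System.out.println(simpleEscape(sep)); // prints \,
        System.out.println(hexEscape(sep));    // prints \x{2c}
        // Both regexps match exactly the separator character.
        System.out.println(Pattern.matches(simpleEscape(sep), ",")); // prints true
        System.out.println(Pattern.matches(hexEscape(sep), ","));    // prints true
    }
}
```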

      This adds compilation cost, because Pattern::x now has to be called [3]. The Pattern::Single automaton node that gets built is the same, so Matcher performance should not be affected [4]. Some of these compilations occur only once per Scanner instance or per Scanner::useLocale call, but others occur while scanning the input [5]. That would explain why the degradation is noticeable even with a single Scanner going through a large input.
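
      A rough way to observe the compilation cost in isolation is sketched below. This is not the attached Main.java; the class name CompileTiming and the iteration counts are assumptions made for illustration. It is a naive System.nanoTime loop, so the numbers are only indicative, and a harness such as JMH would be needed for reliable measurements.

```java
import java.util.regex.Pattern;

// Hypothetical micro-timing sketch; results vary by JVM and machine.
class CompileTiming {
    // Compiles the regex repeatedly and returns the elapsed nanoseconds.
    static long timeCompile(String regex, int iterations) {
        long sink = 0; // consume each result so the loop is not trivially dead
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += Pattern.compile(regex).pattern().length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            System.out.println(sink); // never reached; keeps sink observable
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String simple = "\\,";   // simple escaping
        String hex = "\\x{2c}";  // hexadecimal escaping of the same character
        int iterations = 1_000_000;

        // Warm up both paths before measuring.
        timeCompile(simple, iterations);
        timeCompile(hex, iterations);

        System.out.println("simple: " + timeCompile(simple, iterations) + " ns");
        System.out.println("hex:    " + timeCompile(hex, iterations) + " ns");
    }
}
```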

      We can optimize for the default or most common cases (some group and decimal separators repeat across many locales).
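
      One possible shape for such an optimization is sketched below. It is only an illustration of the idea, not the fix for this ticket: the helper escapeSeparator is hypothetical, and the choice of ',' and '.' as the common separators is an assumption. Those two are ASCII punctuation, for which a plain backslash escape is always a literal match, so they can take the cheap form while every other separator keeps the hexadecimal form.

```java
// Hypothetical fast path; not code from java.util.Scanner.
class SeparatorFastPath {
    // Returns a regexp string that matches exactly the given separator.
    static String escapeSeparator(char sep) {
        switch (sep) {
            // Assumed common cases: cheap to compile, safe as literals.
            case ',': return "\\,";
            case '.': return "\\.";
            // Everything else keeps the hexadecimal form.
            default:  return "\\x{" + Integer.toHexString(sep) + "}";
        }
    }

    public static void main(String[] args) {
        // Common separator: takes the fast path.
        System.out.println("1,234,567".replaceAll(escapeSeparator(','), ""));
        // prints 1234567

        // No-break space (U+00A0) as group separator: takes the fallback.
        System.out.println("1\u00a0234".replaceAll(escapeSeparator('\u00a0'), ""));
        // prints 1234
    }
}
```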

      Attached to this ticket you'll find:

       * A reproducer (TestScannerNoIOManyStreams.java)

       * Simple playground code to compare regexp compilation times (Main.java)

      --
      [1] - https://hg.openjdk.java.net/jdk8u/jdk8u/jdk/rev/a8f0a9ef1797#l1.34
      [2] - https://hg.openjdk.java.net/jdk8u/jdk8u/jdk/rev/a8f0a9ef1797#l1.35
      [3] - https://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/84c5676f140b/src/share/classes/java/util/regex/Pattern.java#l3209
      [4] - https://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/84c5676f140b/src/share/classes/java/util/regex/Pattern.java#l3831
      [5] - https://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/84c5676f140b/src/share/classes/java/util/Scanner.java#l2042

            mbalao Martin Balao Alonso