JDK-8150830

Degradation of String.replace() performance

    • x86
    • generic

      FULL PRODUCT VERSION :
      java version "1.8.0_74"
      Java(TM) SE Runtime Environment (build 1.8.0_74-b31)
      Java HotSpot(TM) Client VM (build 25.74-b31, mixed mode, sharing)

      ADDITIONAL OS VERSION INFORMATION :
      Version 6.1.7601

      EXTRA RELEVANT SYSTEM CONFIGURATION :
      Intel Core i5-4300U @ 1.90 GHz
      8 GB RAM

      A DESCRIPTION OF THE PROBLEM :
      After the JRE was upgraded from build 1.8.0_66-b18 to build 1.8.0_74-b31, code that uses String.replace() started to fail with java.lang.OutOfMemoryError. I have also noticed a significant drop in the performance of String.replace().

      REGRESSION. Last worked in version 8u66

      ADDITIONAL REGRESSION INFORMATION:
      java version "1.8.0_66"
      Java(TM) SE Runtime Environment (build 1.8.0_66-b18)
      Java HotSpot(TM) 64-Bit Server VM (build 25.66-b18, mixed mode)

      Note that the issue is reported on the 32-bit version of the JRE (after its upgrade), while the version used for the regression comparison is 64-bit (and has not been upgraded yet).
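
Because the two runs differ in bitness and VM type as well as in update level, it may help to record each JVM's identity and maximum heap next to the timings. The following is a minimal sketch that uses only standard system properties and Runtime.maxMemory(); the class name VmInfo is hypothetical and not part of the original report.

```java
// Hypothetical helper: prints which JVM is running and how much heap it may use.
class VmInfo {
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        return System.getProperty("java.vm.name")
                + " " + System.getProperty("java.version")
                + ", arch=" + System.getProperty("os.arch")
                + ", maxHeapMB=" + rt.maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Run once with each java executable and compare the two lines.
        System.out.println(describe());
    }
}
```

Running this with each of the two java executables would show whether the default maximum heap differs between them, which may matter for the OutOfMemoryError independently of String.replace().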

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      I am running the test code on a 116 MB text file with about 1.6M lines. Each line has 73 characters plus EOL, of which 8 characters are double quotes. The code reads the file line by line, removes the double quotes, and appends the result to a list. If I remove the list operations from the code, the performance of String.replace() remains about the same and the code does not crash.
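
In JDK 8, String.replace(CharSequence, CharSequence) is implemented on top of java.util.regex using a literal pattern, so every call allocates regex objects in addition to the result string. For comparison, here is a sketch of a regex-free way to remove a single character. The class QuoteStripper and the method stripChar are hypothetical names introduced for illustration; this is a possible workaround to measure against, not a confirmed fix for the reported problem.

```java
// Hypothetical regex-free alternative to line.replace("\"", "").
class QuoteStripper {
    // Returns s with every occurrence of c removed; returns s itself if c is absent.
    static String stripChar(String s, char c) {
        int first = s.indexOf(c);
        if (first < 0) {
            return s; // nothing to remove, avoid any allocation
        }
        StringBuilder sb = new StringBuilder(s.length() - 1);
        sb.append(s, 0, first); // copy the prefix before the first match
        for (int i = first + 1; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch != c) {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String line = "\"a\",\"b\"";
        System.out.println(stripChar(line, '"'));                                 // prints a,b
        System.out.println(stripChar(line, '"').equals(line.replace("\"", ""))); // prints true
    }
}
```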

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      "C:\Program Files\Java\jre1.8.0_66\bin\java" -cp . test file.txt
      t1: 8, t2: 259 , t3: 10
      t1: 3, t2: 170 , t3: 1
      t1: 1, t2: 149 , t3: 2
      t1: 4, t2: 147 , t3: 2
      t1: 2, t2: 169 , t3: 0
      t1: 2, t2: 140 , t3: 3
      t1: 2, t2: 107 , t3: 3
      t1: 2, t2: 103 , t3: 5
      t1: 1, t2: 108 , t3: 2
      t1: 1, t2: 105 , t3: 3
      t1: 1, t2: 126 , t3: 3
      t1: 1, t2: 135 , t3: 2
      t1: 2, t2: 169 , t3: 5
      t1: 0, t2: 162 , t3: 2
      t1: 1, t2: 106 , t3: 1
      t1: 2, t2: 112 , t3: 1
      ACTUAL -
      "C:\Program Files (x86)\Java\jre1.8.0_74\bin\java" -cp . test file.txt
      t1: 8, t2: 489 , t3: 9
      t1: 3, t2: 529 , t3: 6
      t1: 1, t2: 455 , t3: 4
      t1: 4, t2: 394 , t3: 6
      t1: 2, t2: 390 , t3: 5
      t1: 2, t2: 433 , t3: 9
      t1: 7, t2: 386 , t3: 4
      t1: 5, t2: 462 , t3: 3
      t1: 5, t2: 584 , t3: 6
      t1: 1, t2: 357 , t3: 5
      t1: 3, t2: 349 , t3: 2
      t1: 4, t2: 1382 , t3: 1
      t1: 3, t2: 1552 , t3: 6
      t1: 1, t2: 1028 , t3: 7
      t1: 4, t2: 3414 , t3: 3
      Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
              at java.util.regex.Matcher.<init>(Unknown Source)
              at java.util.regex.Matcher.toMatchResult(Unknown Source)
              at java.util.Scanner.match(Unknown Source)
              at java.util.Scanner.hasNextLine(Unknown Source)
              at test.main(test.java:11)

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      import java.nio.file.Paths;
      import java.util.ArrayList;
      import java.util.Scanner;

      public class test {
          public final static void main(String[] args) throws Exception {
              String sFileName = args[0];
              long lineCount = 0, t1 = 0, t2 = 0, t3 = 0;
              ArrayList<String> list = new ArrayList<String>();
              Scanner scanner = new Scanner(Paths.get(sFileName));
              while (scanner.hasNextLine()) {
                  long t = System.currentTimeMillis();
                  String sLine = scanner.nextLine();
                  t1 += System.currentTimeMillis() - t;
                  t = System.currentTimeMillis();
                  String sLineMod = sLine.replace("\"", "");
                  t2 += System.currentTimeMillis() - t;
                  lineCount++;
                  t = System.currentTimeMillis();
                  list.add(sLineMod);
                  t3 += System.currentTimeMillis() - t;
                  if (lineCount % 100000 == 0) {
                      System.out.println("t1: " + t1 + ", t2: " + t2 + " , t3: " + t3);
                      t1 = 0;
                      t2 = 0;
                      t3 = 0;
                  }
              }
              scanner.close();
          }
      }

      ---------- END SOURCE ----------
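
The test above times each call with System.currentTimeMillis(), whose millisecond granularity is coarse compared with a single replace() on a 73-character line. Here is a sketch of a variant that times a whole batch with System.nanoTime() instead. The class ReplaceTimer, the method timeReplace, and the synthetic input lines are assumptions made for illustration and do not come from the original report; the measured interval covers both replace() and list.add().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical batch timer: measures replace() plus list.add() over many lines.
class ReplaceTimer {
    // Strips double quotes from every line into out; returns elapsed nanoseconds.
    static long timeReplace(List<String> lines, List<String> out) {
        long start = System.nanoTime();
        for (String line : lines) {
            out.add(line.replace("\"", ""));
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Synthetic stand-in for the input file: 100,000 lines with 8 quotes each.
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            lines.add("\"f1\",\"f2\",\"f3\",\"f4\" row " + i);
        }
        List<String> out = new ArrayList<>(lines.size());
        long ns = timeReplace(lines, out);
        System.out.println("replace+add: " + ns / 1_000_000 + " ms for " + out.size() + " lines");
    }
}
```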

            Unassigned Unassigned
            webbuggrp Webbug Group