JDK-8076759

AbstractStringBuilder.append(...) should ensure enough capacity for the upcoming "trivial" append calls


    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: P4
    • Fix Version/s: None
    • Affects Version/s: 9
    • Component/s: core-libs

      Current implementation of AbstractStringBuilder.append does:

          public AbstractStringBuilder append(String str) {
              ...
              int len = str.length();
              ensureCapacityInternal(count + len);
              ...
          }

          private void ensureCapacityInternal(int minimumCapacity) {
              if (minimumCapacity - value.length > 0)
                  expandCapacity(minimumCapacity);
          }

          void expandCapacity(int minimumCapacity) {
              int newCapacity = value.length * 2 + 2;
              if (newCapacity - minimumCapacity < 0)
                  newCapacity = minimumCapacity;
              ...
              value = Arrays.copyOf(value, newCapacity);
          }

      It would appear we double the value array every time... But no! If you append, say, a large String right away into a clean StringBuilder, and it blows the ((value.length * 2 + 2) - minimumCapacity < 0) check, we expand the SB to hold the String exactly. Then, the upcoming append() will have to resize *again*, even if only a trivial amount of extra space is required.
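      As a quick sanity check (not part of the benchmark below; the numbers assume the default initial capacity of 16 chars and the expandCapacity() logic quoted above), the double resize is visible through StringBuilder.capacity():

          // Illustrative demo only: a large first append sizes the buffer
          // exactly, so the very next small append forces another full copy.
          char[] big = new char[4096];
          java.util.Arrays.fill(big, 'a');

          StringBuilder sb = new StringBuilder();   // capacity = 16
          sb.append(new String(big));
          System.out.println(sb.capacity());        // 4096: sized exactly to the String
          sb.append(Long.MAX_VALUE);                // 19 more chars -> resize again
          System.out.println(sb.capacity());        // 8194 = 4096 * 2 + 2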

      E.g. these two tests differ wildly in performance:

      @State(Scope.Benchmark)
      @BenchmarkMode(Mode.AverageTime)
      @OutputTimeUnit(TimeUnit.NANOSECONDS)
      @Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
      @Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
      @Fork(3)
      public class ConcatOrderCopy {

          @Param({"1", "64", "4096"})
          int size;

          private String s;
          private long l;

          @Setup
          public void setup() {
              s = "";
              for (int c = 0; c < size; c++) {
                  s += "a";
              }
              l = Long.MAX_VALUE;
          }

          @Benchmark
          public String string_long() {
              return new StringBuilder().append(s).append(l).toString();
          }

          @Benchmark
          public String long_string() {
              return new StringBuilder().append(l).append(s).toString();
          }

      }

      On 8u40, Linux x86_64, it yields:

      Benchmark (size) Mode Cnt Score Error Units
      ConcatOrderCopy.long_string 1 avgt 15 75.292 ± 4.153 ns/op
      ConcatOrderCopy.long_string 64 avgt 15 117.067 ± 28.454 ns/op
      ConcatOrderCopy.long_string 4096 avgt 15 1402.182 ± 19.783 ns/op
      ConcatOrderCopy.string_long 1 avgt 15 70.958 ± 3.714 ns/op
      ConcatOrderCopy.string_long 64 avgt 15 125.451 ± 51.675 ns/op
      ConcatOrderCopy.string_long 4096 avgt 15 2640.751 ± 20.027 ns/op

      On large enough data, concatenating String+long is *much* slower than concatenating long+String: with the String appended first, the buffer is sized exactly to that String, so the trailing append(long) forces a second full copy of the char array.

      The answer seems to be to always allocate some gap at the end. Say, "minimumCapacity + 20"?
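      A minimal sketch of what such a padding change could look like, purely illustrative (the gap of 20 chars is just the number floated above, and expandCapacity() stays as quoted):

          // Illustrative sketch only, not an actual JDK patch: when a resize is
          // unavoidable, ask for (minimumCapacity + 20) so that a few small
          // appends following a large one do not trigger another copy.
          private void ensureCapacityInternal(int minimumCapacity) {
              if (minimumCapacity - value.length > 0)
                  expandCapacity(minimumCapacity + 20);   // hypothetical gap
          }

      With such a gap, the string_long case above would expand once to 4096 + 20 chars, and the subsequent append(long) (at most 20 chars, for Long.MIN_VALUE) would fit without another copy.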

            Assignee: Aleksey Shipilev (shade)
            Reporter: Aleksey Shipilev (shade)