Collections.singletonList is significantly slower than equivalents on ARM64


    • Type: Bug
    • Resolution: Unresolved
    • Priority: P4
    • tbd
    • Affects Version/s: 25.0.1
    • Component/s: core-libs
    • generic
    • generic

      ADDITIONAL SYSTEM INFORMATION :
          java.runtime.name = OpenJDK Runtime Environment
          java.runtime.version = 25.0.1+8-27
          java.specification.name = Java Platform API Specification
          java.specification.vendor = Oracle Corporation
          java.specification.version = 25
          java.vendor = Oracle Corporation
          java.vendor.url = https://java.oracle.com/
          java.vendor.url.bug = https://bugreport.java.com/bugreport/
          java.version = 25.0.1
          java.version.date = 2025-10-21
          java.vm.compressedOopsMode = Zero based
          java.vm.info = mixed mode, sharing
          java.vm.name = OpenJDK 64-Bit Server VM
          java.vm.specification.name = Java Virtual Machine Specification
          java.vm.specification.vendor = Oracle Corporation
          java.vm.specification.version = 25
          java.vm.vendor = Oracle Corporation
          java.vm.version = 25.0.1+8-27
          jdk.debug = release
          line.separator = \n
          native.encoding = UTF-8
          os.arch = aarch64
          os.name = Mac OS X
          os.version = 15.7.2
          path.separator = :
          stderr.encoding = UTF-8

      A DESCRIPTION OF THE PROBLEM :
      Recently I noticed an IDE warning where I was using Arrays::asList to construct a List with a single item. This led me to check whether List.of or Collections.singletonList would be the best option to replace it. According to available documentation/literature online, Collections.singletonList _should_ be faster than List.of, but any difference is very marginal, and both are preferable to Arrays::asList for constructing single-item lists.
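
      For context, the three factories are not drop-in replacements for each other in every respect. The following is a minimal sketch of the behavioural differences as I understand them from the java.util API; the class and method names (SingleItemLists, throwsType, and so on) are made up for illustration and are not part of the benchmark.

      ```java
      import java.util.Arrays;
      import java.util.Collections;
      import java.util.List;

      public class SingleItemLists {

        // Returns true if running the action throws an exception of the given type.
        static boolean throwsType(Class<? extends RuntimeException> type, Runnable action) {
          try {
            action.run();
            return false;
          } catch (RuntimeException e) {
            return type.isInstance(e);
          }
        }

        // All three factories produce lists that compare equal for the same element.
        static boolean allEqual() {
          List<String> a = Arrays.asList("x");
          List<String> s = Collections.singletonList("x");
          List<String> o = List.of("x");
          return a.equals(s) && s.equals(o);
        }

        // Arrays.asList is a fixed-size view of an array: set() is allowed, add() is not.
        static boolean asListAllowsSetButNotAdd() {
          List<String> a = Arrays.asList("x");
          a.set(0, "y");
          return a.get(0).equals("y")
              && throwsType(UnsupportedOperationException.class, () -> a.add("z"));
        }

        // Collections.singletonList is immutable but accepts a null element.
        static boolean singletonIsImmutableAndAllowsNull() {
          List<String> s = Collections.singletonList(null);
          return s.get(0) == null
              && throwsType(UnsupportedOperationException.class, () -> s.set(0, "y"));
        }

        // List.of is immutable and rejects null elements at construction time.
        static boolean ofIsImmutableAndRejectsNull() {
          List<String> o = List.of("x");
          return throwsType(UnsupportedOperationException.class, () -> o.set(0, "y"))
              && throwsType(NullPointerException.class, () -> List.of((Object) null));
        }

        public static void main(String[] args) {
          System.out.println(allEqual());
          System.out.println(asListAllowsSetButNotAdd());
          System.out.println(singletonIsImmutableAndAllowsNull());
          System.out.println(ofIsImmutableAndRejectsNull());
        }
      }
      ```

      None of this affects the benchmark below, which only constructs the lists with a non-null element.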

      On x64 / Windows this does seem to be true:

      ```
      Benchmark Mode Cnt Score Error Units
      Benchmark.asList thrpt 30 280207.761 ± 3616.566 ops/ms
      Benchmark.asList:·gc.alloc.rate thrpt 30 12826.813 ± 165.553 MB/sec
      Benchmark.asList:·gc.alloc.rate.norm thrpt 30 48.000 ± 0.001 B/op
      Benchmark.asList:·gc.count thrpt 30 18778.000 counts
      Benchmark.asList:·gc.time thrpt 30 7555.000 ms
      Benchmark.of thrpt 30 551714.274 ± 1430.914 ops/ms
      Benchmark.of:·gc.alloc.rate thrpt 30 12627.667 ± 32.751 MB/sec
      Benchmark.of:·gc.alloc.rate.norm thrpt 30 24.000 ± 0.001 B/op
      Benchmark.of:·gc.count thrpt 30 18702.000 counts
      Benchmark.of:·gc.time thrpt 30 7378.000 ms
      Benchmark.singletonList thrpt 30 552778.754 ± 986.674 ops/ms
      Benchmark.singletonList:·gc.alloc.rate thrpt 30 12652.031 ± 22.581 MB/sec
      Benchmark.singletonList:·gc.alloc.rate.norm thrpt 30 24.000 ± 0.001 B/op
      Benchmark.singletonList:·gc.count thrpt 30 18585.000 counts
      Benchmark.singletonList:·gc.time thrpt 30 7212.000 ms
      ```

      However, on OSX/ARM64 Collections::singletonList is _significantly_ slower than both (according to my simple benchmark).

      ```
      Benchmark Mode Cnt Score Error Units
      Benchmark.asList thrpt 30 292303.175 ± 2441.718 ops/ms
      Benchmark.asList:·gc.alloc.rate thrpt 30 13380.515 ± 111.773 MB/sec
      Benchmark.asList:·gc.alloc.rate.norm thrpt 30 48.000 ± 0.001 B/op
      Benchmark.asList:·gc.count thrpt 30 12866.000 counts
      Benchmark.asList:·gc.time thrpt 30 6820.000 ms
      Benchmark.of thrpt 30 519191.871 ± 12149.670 ops/ms
      Benchmark.of:·gc.alloc.rate thrpt 30 11883.295 ± 278.079 MB/sec
      Benchmark.of:·gc.alloc.rate.norm thrpt 30 24.000 ± 0.001 B/op
      Benchmark.of:·gc.count thrpt 30 12054.000 counts
      Benchmark.of:·gc.time thrpt 30 6342.000 ms
      Benchmark.singletonList thrpt 30 62301.767 ± 909.762 ops/ms
      Benchmark.singletonList:·gc.alloc.rate thrpt 30 1425.962 ± 20.823 MB/sec
      Benchmark.singletonList:·gc.alloc.rate.norm thrpt 30 24.000 ± 0.001 B/op
      Benchmark.singletonList:·gc.count thrpt 30 2118.000 counts
      Benchmark.singletonList:·gc.time thrpt 30 1833.000 ms
      ```

      According to this benchmark Collections::singletonList is ~8x slower than List::of.
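
      The slowdown factors can be recomputed from the throughput scores in the ARM64 table above. This is only arithmetic on the reported numbers; the class name SlowdownRatios and its helper methods are made up for illustration.

      ```java
      import java.util.Locale;

      public class SlowdownRatios {

        // Throughput scores (ops/ms) copied from the ARM64 results above.
        static final double OF = 519191.871;
        static final double AS_LIST = 292303.175;
        static final double SINGLETON_LIST = 62301.767;

        // How many times faster the baseline is than the measured benchmark.
        static double ratio(double baseline, double measured) {
          return baseline / measured;
        }

        static String format(double value) {
          return String.format(Locale.ROOT, "%.2fx", value);
        }

        public static void main(String[] args) {
          // prints "List.of vs singletonList: 8.33x"
          System.out.println("List.of vs singletonList: " + format(ratio(OF, SINGLETON_LIST)));
          // prints "Arrays.asList vs singletonList: 4.69x"
          System.out.println("Arrays.asList vs singletonList: " + format(ratio(AS_LIST, SINGLETON_LIST)));
        }
      }
      ```

      The same factor of about 8.3 shows up in the gc.alloc.rate rows (11883.295 vs 1425.962 MB/sec), while gc.alloc.rate.norm is 24 B/op in both cases, so the per-operation allocation is unchanged and only the rate differs.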


      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      ```
      package brutus;


      import java.util.Arrays;
      import java.util.Collections;
      import java.util.List;
      import java.util.concurrent.TimeUnit;
      import org.openjdk.jmh.annotations.BenchmarkMode;
      import org.openjdk.jmh.annotations.Fork;
      import org.openjdk.jmh.annotations.Measurement;
      import org.openjdk.jmh.annotations.Mode;
      import org.openjdk.jmh.annotations.OutputTimeUnit;
      import org.openjdk.jmh.annotations.Scope;
      import org.openjdk.jmh.annotations.State;
      import org.openjdk.jmh.annotations.Warmup;
      import org.openjdk.jmh.infra.Blackhole;

      @Fork(3)
      @Warmup(iterations = 5, time = 20, timeUnit = TimeUnit.SECONDS)
      @Measurement(iterations = 10, time = 30, timeUnit = TimeUnit.SECONDS)
      @BenchmarkMode(Mode.Throughput)
      @OutputTimeUnit(TimeUnit.MILLISECONDS)
      @State(Scope.Benchmark)
      public class Benchmark {

        private static final Object THING = new Object();

        @org.openjdk.jmh.annotations.Benchmark
        public void of(final Blackhole blackhole) {
          blackhole.consume(List.of(THING));
        }

        @org.openjdk.jmh.annotations.Benchmark
        public void asList(final Blackhole blackhole) {
          blackhole.consume(Arrays.asList(THING));
        }

        @org.openjdk.jmh.annotations.Benchmark
        public void singletonList(final Blackhole blackhole) {
          blackhole.consume(Collections.singletonList(THING));
        }
      }
      ```

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      For one-item lists, Arrays::asList is slower than Collections::singletonList and List::of, and the latter two are approximately as performant as each other on both x64 and ARM64.
      ACTUAL -
      Collections::singletonList is ~4x slower than Arrays::asList and ~8x slower than List::of on ARM64

      ---------- BEGIN SOURCE ----------
      package brutus;


      import java.util.Arrays;
      import java.util.Collections;
      import java.util.List;
      import java.util.concurrent.TimeUnit;
      import org.openjdk.jmh.annotations.BenchmarkMode;
      import org.openjdk.jmh.annotations.Fork;
      import org.openjdk.jmh.annotations.Measurement;
      import org.openjdk.jmh.annotations.Mode;
      import org.openjdk.jmh.annotations.OutputTimeUnit;
      import org.openjdk.jmh.annotations.Scope;
      import org.openjdk.jmh.annotations.State;
      import org.openjdk.jmh.annotations.Warmup;
      import org.openjdk.jmh.infra.Blackhole;

      @Fork(3)
      @Warmup(iterations = 5, time = 20, timeUnit = TimeUnit.SECONDS)
      @Measurement(iterations = 10, time = 30, timeUnit = TimeUnit.SECONDS)
      @BenchmarkMode(Mode.Throughput)
      @OutputTimeUnit(TimeUnit.MILLISECONDS)
      @State(Scope.Benchmark)
      public class Benchmark {

        private static final Object THING = new Object();

        @org.openjdk.jmh.annotations.Benchmark
        public void of(final Blackhole blackhole) {
          blackhole.consume(List.of(THING));
        }

        @org.openjdk.jmh.annotations.Benchmark
        public void asList(final Blackhole blackhole) {
          blackhole.consume(Arrays.asList(THING));
        }

        @org.openjdk.jmh.annotations.Benchmark
        public void singletonList(final Blackhole blackhole) {
          blackhole.consume(Collections.singletonList(THING));
        }
      }

      ---------- END SOURCE ----------

            Assignee:
            Stuart Marks
            Reporter:
            Webbug Group
            Votes:
            0
            Watchers:
            3
