MemorySegment::fill performance deficit on aarch64

    • Type: Bug
    • Resolution: Duplicate
    • Priority: P4
    • tbd
    • Affects Version/s: 22, 23, 24
    • Component/s: core-libs

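      `MemorySegment::fill` is roughly an order of magnitude slower on aarch64 platforms than on x64 platforms for small segments: about 39 to 56 ns per operation versus about 3 to 9 ns per operation for non-empty segments in the benchmark below.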

      Once this has been fixed, the `FILL_NATIVE_THRESHOLD` constant in `AbstractMemorySegmentImpl.java` should also be updated.
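
      The remark about `FILL_NATIVE_THRESHOLD` suggests a size cut-off that selects between two fill strategies. A minimal sketch of that pattern follows, assuming the constant separates a plain Java loop for small segments from a bulk fill for larger ones. The class name `ThresholdFillSketch`, the constant `FILL_THRESHOLD`, and its value of 32 are hypothetical and are not taken from the JDK sources.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public final class ThresholdFillSketch {

    // Hypothetical cut-off in bytes; the real JDK constant may differ.
    static final long FILL_THRESHOLD = 32;

    // Fills the segment with `value`, choosing a strategy by segment size.
    static MemorySegment fill(MemorySegment segment, byte value) {
        long size = segment.byteSize();
        if (size < FILL_THRESHOLD) {
            // Small segment: write byte by byte in Java code.
            for (long i = 0; i < size; i++) {
                segment.set(ValueLayout.JAVA_BYTE, i, value);
            }
            return segment;
        }
        // Larger segment: delegate to the built-in bulk fill.
        return segment.fill(value);
    }

    public static void main(String[] args) {
        byte[] small = new byte[7];
        byte[] large = new byte[64];
        fill(MemorySegment.ofArray(small), (byte) 1);
        fill(MemorySegment.ofArray(large), (byte) 2);
        System.out.println(small[6] + " " + large[63]);
    }
}
```

      If the per-call cost of the bulk path changes on a platform, the size at which the two paths break even moves with it, which is presumably why the threshold would need to be revisited after a fix.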


      Benchmark results (column headers are segment sizes in bytes; values are ns per operation):

      Platform        0     1     2     3     4     5     6     7     8    16    21    32    64
      Linux a64     3.4  40.5  40.0  40.0  39.0  40.0  41.0  40.0  39.0  39.0  41.0  39.0  40.0
      Linux x64     2.2   5.7   4.9   5.7   4.6   5.7   6.6   5.7   4.5   4.9   5.7   6.2   4.8
      macOS a64     1.6  52.0  52.0  54.0  52.0  55.0  54.0  55.0  53.0  53.0  56.0  54.0  53.0
      macOS x64     2.0   7.2   5.2   7.1   4.9   6.9   5.5   6.9   4.2   4.5   8.6   5.0   5.2
      Windows x64   1.9   4.5   3.9   4.5   3.5   5.4   5.1   4.5   3.4   3.7   4.5   4.8   3.7
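
      To put a number on the gap, the per-platform averages from the table above can be compared directly. The sketch below does that arithmetic for the non-empty sizes only (the 0-byte column is left out); the class and method names are illustrative. By this measure aarch64 comes out roughly 7x slower than x64 on Linux and roughly 9x slower on macOS.

```java
import java.util.Arrays;

public final class SlowdownRatio {

    // ns/op for non-empty segments, copied from the table (0-byte column omitted).
    static final double[] LINUX_A64 = {40.5, 40.0, 40.0, 39.0, 40.0, 41.0, 40.0, 39.0, 39.0, 41.0, 39.0, 40.0};
    static final double[] LINUX_X64 = {5.7, 4.9, 5.7, 4.6, 5.7, 6.6, 5.7, 4.5, 4.9, 5.7, 6.2, 4.8};
    static final double[] MACOS_A64 = {52.0, 52.0, 54.0, 52.0, 55.0, 54.0, 55.0, 53.0, 53.0, 56.0, 54.0, 53.0};
    static final double[] MACOS_X64 = {7.2, 5.2, 7.1, 4.9, 6.9, 5.5, 6.9, 4.2, 4.5, 8.6, 5.0, 5.2};

    // Ratio of the mean time of the slower platform to the mean time of the faster one.
    static double meanRatio(double[] slow, double[] fast) {
        double slowMean = Arrays.stream(slow).average().orElseThrow();
        double fastMean = Arrays.stream(fast).average().orElseThrow();
        return slowMean / fastMean;
    }

    public static void main(String[] args) {
        System.out.printf("Linux a64/x64: %.1fx%n", meanRatio(LINUX_A64, LINUX_X64));
        System.out.printf("macOS a64/x64: %.1fx%n", meanRatio(MACOS_A64, MACOS_X64));
    }
}
```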

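      import java.lang.foreign.MemorySegment;
      import java.util.concurrent.TimeUnit;

      import org.openjdk.jmh.annotations.Benchmark;
      import org.openjdk.jmh.annotations.BenchmarkMode;
      import org.openjdk.jmh.annotations.Fork;
      import org.openjdk.jmh.annotations.Measurement;
      import org.openjdk.jmh.annotations.Mode;
      import org.openjdk.jmh.annotations.OutputTimeUnit;
      import org.openjdk.jmh.annotations.Param;
      import org.openjdk.jmh.annotations.Scope;
      import org.openjdk.jmh.annotations.Setup;
      import org.openjdk.jmh.annotations.State;
      import org.openjdk.jmh.annotations.Warmup;

      @BenchmarkMode(Mode.AverageTime)
      @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
      @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
      @State(Scope.Thread)
      @OutputTimeUnit(TimeUnit.NANOSECONDS)
      @Fork(value = 3)
      public class TestFill {

          // Size in bytes of the heap segment that is filled on each invocation.
          @Param({"0", "1", "2", "3", "4", "5", "6", "7", "8", "16", "32", "64", "128"})
          public int ELEM_SIZE;

          MemorySegment heapSegment;

          // Wrap a fresh byte array of the requested size in a heap segment.
          @Setup
          public void setup() {
              var array = new byte[ELEM_SIZE];
              heapSegment = MemorySegment.ofArray(array);
          }

          // Measures the cost of zero-filling the whole segment.
          @Benchmark
          public void heap_segment_fill() {
              heapSegment.fill((byte) 0);
          }
      }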


            Assignee: Johan Sjölen
            Reporter: Per-Ake Minborg