JDK-8300669

AArch64: Table based tails processing and wider stores for Arrays.fill() intrinsic


    • Enhancement
    • Resolution: Unresolved
    • P4
    • tbd
    • 21
    • hotspot
    • aarch64
    • generic

      Experiments show that fill() implementations suffer from having many branches. This can be improved with the following ideas:

      1. Wider stores in the main case (aligned and long enough target).

      The current implementation uses STP in the main loop. We could switch to SIMD stores (https://bugs.openjdk.org/browse/JDK-8268233), but they are more effective than wide GPR stores only on some micro-architectures, and the GPR variant can be made wider as well. The main store variant can be selected by a run-time flag. Experiments show that current SVE implementations give no benefit over SIMD stores, but if that changes, it makes sense to have an implementation that can be further extended with an SVE store variant.
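
      The effect of selecting the main-loop store width can be modelled in plain Java. This is only a sketch under assumptions: the class, the StoreVariant enum and its byte counts are hypothetical and do not come from HotSpot, and each 8-byte VarHandle write merely stands in for part of one wider store.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

class WideStoreFillSketch {
    /** Hypothetical store variants; the byte counts are illustrative, not measured choices. */
    enum StoreVariant {
        GPR_PAIR(16),   // models one STP of two 64-bit registers
        SIMD_PAIR(32),  // models one STP of two 128-bit SIMD registers
        GPR_WIDE(64);   // models several GPR pair stores per iteration (assumed width)

        final int bytesPerIteration;

        StoreVariant(int bytesPerIteration) {
            this.bytesPerIteration = bytesPerIteration;
        }
    }

    // Views a byte[] as 8-byte words so that one call models one 64-bit store.
    private static final VarHandle WORD =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

    static void fill(byte[] a, byte value, StoreVariant variant) {
        long pattern = (value & 0xFFL) * 0x0101010101010101L; // replicate the byte 8 times
        int step = variant.bytesPerIteration;
        int i = 0;
        // Main loop: one iteration writes `step` bytes, as a wider store would.
        for (; i <= a.length - step; i += step) {
            for (int off = 0; off < step; off += 8) {
                WORD.set(a, i + off, pattern);
            }
        }
        // Remaining bytes are handled one at a time in this sketch.
        for (; i < a.length; i++) {
            a[i] = value;
        }
    }

    public static void main(String[] args) {
        // The variant would come from a run-time flag; here it is a program argument.
        StoreVariant variant =
                args.length > 0 ? StoreVariant.valueOf(args[0]) : StoreVariant.GPR_PAIR;
        byte[] a = new byte[100];
        fill(a, (byte) 0x5A, variant);
        System.out.println(variant + ": first=" + a[0] + " last=" + a[99]);
    }
}
```

      Keeping the width as a parameter of one loop shape is what would let a further variant (for example an SVE one) be added without touching the rest of the routine.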

      2. Align wider stores to a larger boundary.
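
      The arithmetic behind aligning to a larger boundary can be sketched as follows. The 64-byte boundary, the 64-byte store width and the sample address are assumptions chosen for illustration; the issue does not specify them.

```java
import java.util.Arrays;

class AlignmentSplitSketch {
    /** Bytes needed to advance `address` to the next multiple of `boundary` (a power of two). */
    static long headBytes(long address, int boundary) {
        return -address & (boundary - 1);
    }

    /** Splits a fill of `length` bytes into {head, body, tail} byte counts. */
    static long[] split(long address, long length, int boundary, int storeWidth) {
        long head = Math.min(length, headBytes(address, boundary)); // unaligned prefix
        long body = (length - head) / storeWidth * storeWidth;      // whole aligned wide stores
        long tail = length - head - body;                           // leftover after the body
        return new long[] {head, body, tail};
    }

    public static void main(String[] args) {
        // Hypothetical target: address 0x1009, 200 bytes, 64-byte boundary and store width.
        long[] parts = split(0x1009L, 200, 64, 64);
        System.out.println(Arrays.toString(parts)); // [55, 128, 17]
    }
}
```

      A larger boundary makes the head and tail longer on average, which is one reason their processing cost matters.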

      3. Table based implementation for alignment and processing tails.

      As fill() implementations are generated as stubs, for small enough lengths there may be sub-routines that fill arrays of exactly that length without branches. The trade-off is the additional code size spent on such duplicated implementations. However, experiments show that it is moderate.
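
      The table idea can be illustrated in Java with one straight-line routine per length, selected by a single indexed lookup. This is a model only: the names are hypothetical, the table covers lengths 0 to 7 by assumption, and Java's own bounds checks still branch, unlike a generated stub.

```java
class TableTailFillSketch {
    /** A straight-line routine that fills one fixed number of bytes starting at `off`. */
    interface FixedFill {
        void fill(byte[] a, int off, byte v);
    }

    // TABLE[n] fills exactly n bytes with no loop and no length test.
    static final FixedFill[] TABLE = {
        (a, o, v) -> { },
        (a, o, v) -> { a[o] = v; },
        (a, o, v) -> { a[o] = v; a[o + 1] = v; },
        (a, o, v) -> { a[o] = v; a[o + 1] = v; a[o + 2] = v; },
        (a, o, v) -> { a[o] = v; a[o + 1] = v; a[o + 2] = v; a[o + 3] = v; },
        (a, o, v) -> { a[o] = v; a[o + 1] = v; a[o + 2] = v; a[o + 3] = v; a[o + 4] = v; },
        (a, o, v) -> { a[o] = v; a[o + 1] = v; a[o + 2] = v; a[o + 3] = v; a[o + 4] = v;
                       a[o + 5] = v; },
        (a, o, v) -> { a[o] = v; a[o + 1] = v; a[o + 2] = v; a[o + 3] = v; a[o + 4] = v;
                       a[o + 5] = v; a[o + 6] = v; },
    };

    static final int CHUNK = TABLE.length; // 8 bytes per main-loop iteration

    static void fill(byte[] a, byte v) {
        int i = 0;
        // Main loop over whole chunks; arrays shorter than CHUNK skip it entirely.
        for (; i <= a.length - CHUNK; i += CHUNK) {
            for (int k = 0; k < CHUNK; k++) {
                a[i + k] = v;
            }
        }
        // One table dispatch replaces a chain of "at least 4? 2? 1?" tail branches.
        TABLE[a.length - i].fill(a, i, v);
    }

    public static void main(String[] args) {
        byte[] a = new byte[13];
        fill(a, (byte) 7);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

      The seven non-empty entries repeat the same stores at different lengths, which is the code-size cost mentioned above.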

        1. ArraysFill.java
          7 kB
          Dmitry Chuyko
        2. arrays-fill.ods
          122 kB
          Dmitry Chuyko

            dchuyko Dmitry Chuyko