JDK-8345245

C2 SuperWord: further improve latency after PhaseIdealLoop::move_unordered_reduction_out_of_loop


    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: P4
    • Fix Version: tbd
    • Affects Version: 24
    • Component: hotspot

      [~qamai] had this idea, I'm filing it for him.

      When we vectorize reductions, we try to move them out of the loop, see PhaseIdealLoop::move_unordered_reduction_out_of_loop introduced in JDK-8302652 / https://github.com/openjdk/jdk/pull/13056.
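      To make the starting point concrete, here is a rough scalar emulation of what moving an unordered reduction out of the loop means. It is only a sketch: the class name, the method names, and the fixed LANES width are hypothetical, and a plain int[] stands in for a vector register. It is not the C2 implementation. It relies on int addition being associative and commutative, which is why the order of the adds may change.

```java
public class ReductionOutOfLoopSketch {
    // Hypothetical vector width, chosen only for illustration.
    static final int LANES = 4;

    // Reference: plain scalar reduction.
    static int scalarSum(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    // Emulates a vectorized loop that still reduces inside the loop:
    // every iteration collapses one "vector" across its lanes into the scalar sum.
    static int reduceInsideLoop(int[] a) {
        int sum = 0;
        int i = 0;
        for (; i + LANES <= a.length; i += LANES) {
            int acrossLanes = 0;
            for (int l = 0; l < LANES; l++) {
                acrossLanes += a[i + l];
            }
            sum += acrossLanes;
        }
        for (; i < a.length; i++) { // scalar tail
            sum += a[i];
        }
        return sum;
    }

    // Emulates the loop after the reduction was moved out:
    // the loop only does lane-wise adds into a vector accumulator,
    // and the across-lanes reduction happens once, after the loop.
    static int reduceOutsideLoop(int[] a) {
        int[] acc = new int[LANES]; // stands in for a vector accumulator
        int i = 0;
        for (; i + LANES <= a.length; i += LANES) {
            for (int l = 0; l < LANES; l++) {
                acc[l] += a[i + l];
            }
        }
        int sum = 0;
        for (int l = 0; l < LANES; l++) { // single reduction after the loop
            sum += acc[l];
        }
        for (; i < a.length; i++) { // scalar tail
            sum += a[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] a = new int[1003];
        for (int i = 0; i < a.length; i++) {
            a[i] = i * 31 + 7;
        }
        System.out.println(scalarSum(a) == reduceInsideLoop(a));
        System.out.println(scalarSum(a) == reduceOutsideLoop(a));
    }
}
```

      In the second form, each loop iteration still adds into the same accumulator, so after unrolling the lane-wise adds form a dependent chain.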

      That still leaves us with a chain of dependent vector adds in the loop, and the latency of that chain can become the bottleneck. I'm copying this from elsewhere:

      [~qamai]:
      Reassociation idea: A reduction loop is latency-bound, so we can reassociate the operations of an unrolled loop to better saturate the ALU and load/store units. E.g., transform x4 + (x3 + (x2 + (x1 + x))) into x + (x4 + (x3 + (x2 + x1))), so that only the outermost add depends on the loop-carried x. This should be easier and introduce less register pressure than having several dedicated reduction lanes.
      [~epeter]:
      Ok, yes. After moving the reduction out of the loop, we now have add-vectors in a sequence.
      This has high latency. We could further improve things this way:
      - give each its own phi -> smaller latency but requires more registers
      - reassociate them -> if we do it right, i.e. xv = xv + (xv4 + (xv3 + (xv2 + xv1))), then the latency is still minimal, but the register pressure on the backedge is smaller. Nice idea!
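
      The three shapes discussed above can be sketched on scalars as follows. This is an illustration only: the class and method names are hypothetical, the unroll factor of 4 is an assumption, and plain ints stand in for the vectors xv1..xv4. All three variants compute the same result for int addition; they differ in how many adds sit on the loop-carried dependency and in how many values are live across the backedge.

```java
public class ReassociationSketch {
    // Chained form: x4 + (x3 + (x2 + (x1 + x))).
    // All four adds depend on the loop-carried x, so they execute back to back.
    static int chained(int[] a) {
        int x = 0;
        int i = 0;
        for (; i + 4 <= a.length; i += 4) {
            int x1 = a[i], x2 = a[i + 1], x3 = a[i + 2], x4 = a[i + 3];
            x = x4 + (x3 + (x2 + (x1 + x)));
        }
        for (; i < a.length; i++) { // scalar tail
            x += a[i];
        }
        return x;
    }

    // Reassociated form: x + (x4 + (x3 + (x2 + x1))).
    // The inner sum does not depend on x, so only one add is on the
    // loop-carried chain, and only x is live across the backedge.
    static int reassociated(int[] a) {
        int x = 0;
        int i = 0;
        for (; i + 4 <= a.length; i += 4) {
            int x1 = a[i], x2 = a[i + 1], x3 = a[i + 2], x4 = a[i + 3];
            x = x + (x4 + (x3 + (x2 + x1)));
        }
        for (; i < a.length; i++) { // scalar tail
            x += a[i];
        }
        return x;
    }

    // Separate accumulators: each unrolled add gets its own loop-carried
    // value (its own phi). One add per chain, but four values are live
    // across the backedge.
    static int separateAccumulators(int[] a) {
        int s1 = 0, s2 = 0, s3 = 0, s4 = 0;
        int i = 0;
        for (; i + 4 <= a.length; i += 4) {
            s1 += a[i];
            s2 += a[i + 1];
            s3 += a[i + 2];
            s4 += a[i + 3];
        }
        int x = (s1 + s2) + (s3 + s4); // combine once after the loop
        for (; i < a.length; i++) { // scalar tail
            x += a[i];
        }
        return x;
    }

    public static void main(String[] args) {
        int[] a = new int[1003];
        for (int i = 0; i < a.length; i++) {
            a[i] = i * 17 - 5;
        }
        System.out.println(chained(a) == reassociated(a));
        System.out.println(chained(a) == separateAccumulators(a));
    }
}
```

      Whether the reassociated form is actually faster depends on the reduction chain being the bottleneck, which this sketch does not measure.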

      I may soon refactor away PhaseIdealLoop::move_unordered_reduction_out_of_loop and move it into VLoop::optimize, so that we can already predict during auto-vectorization whether we can move the reduction nodes out of the loop, which makes vectorization more profitable.

      So this optimization would have to be a stand-alone optimization. Maybe it could be done in IGVN after loop-opts, when we are done super-unrolling.

      It would require that we find a benchmark where the reduction latency is the bottleneck, and not any other computation or memory operation.

            epeter Emanuel Peter