JDK-8346964

C2: Improve integer multiplication with constant in MulINode::Ideal()


    • Enhancement
    • Resolution: Unresolved
    • P4
    • tbd
    • 25
    • hotspot

      A multiplication by a constant, `x*C`, can be optimized into cheaper IR nodes such as adds and shifts. For example:
      1. `x*8` can be optimized to `x<<3`.
      2. `x*9` can be optimized to `x+(x<<3)`, which can be lowered to a single ADD instruction with a shifted operand on some architectures, such as AArch64 and x86_64.
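
      The two rewrites above can be checked directly in Java, whose `int` arithmetic wraps modulo 2^32 in the same way for multiplication, addition and shifts. The following is only an illustrative sketch; the class and method names are made up for this example and do not appear in HotSpot. Note that the parentheses in `x + (x << 3)` are required, because `+` binds tighter than `<<` in Java.

```java
public class StrengthReductionDemo {
    // x * 8 rewritten as a single left shift.
    static int times8(int x) {
        return x << 3;
    }

    // x * 9 rewritten as an add with a shifted operand.
    // Without the parentheses this would parse as (x + x) << 3, i.e. x * 16.
    static int times9(int x) {
        return x + (x << 3);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 7, 123456789, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            // Both sides wrap identically on overflow, so the identities hold for every int.
            if (times8(x) != x * 8 || times9(x) != x * 9) {
                throw new AssertionError("mismatch for x = " + x);
            }
        }
        System.out.println("all identities hold");
    }
}
```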

      C2 currently implements a few such patterns in the mid-end:
      1. |C| = 1<<n (n>0)
      2. |C| = (1<<n) - 1 (n>0)
      3. |C| = (1<<m) + (1<<n) (m>n, n>=0)
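
      As a rough illustration of the three cases, the sketch below classifies a constant by the shape of |C|. It is not the code in `MulINode::Ideal()`; the class, enum and method names are hypothetical, the constants 0 and ±1 are treated as trivial and left out, and overlapping cases such as 3 are resolved in the order of the list above, which is an assumption of this example rather than a statement about C2.

```java
public class MulConstPatterns {
    enum Pattern { POWER_OF_TWO, POWER_OF_TWO_MINUS_ONE, SUM_OF_TWO_POWERS, NONE }

    // Classify a multiplication constant by the bit pattern of its absolute value.
    static Pattern classify(int c) {
        // Widen to long so that Math.abs is well defined for Integer.MIN_VALUE.
        long abs = Math.abs((long) c);
        if (abs < 2) {
            return Pattern.NONE;                    // 0 and +/-1: trivial, not covered here
        }
        if (Long.bitCount(abs) == 1) {
            return Pattern.POWER_OF_TWO;            // |C| = 1<<n
        }
        if (Long.bitCount(abs + 1) == 1) {
            return Pattern.POWER_OF_TWO_MINUS_ONE;  // |C| = (1<<n) - 1
        }
        if (Long.bitCount(abs) == 2) {
            return Pattern.SUM_OF_TWO_POWERS;       // |C| = (1<<m) + (1<<n)
        }
        return Pattern.NONE;
    }

    public static void main(String[] args) {
        for (int c : new int[] {8, -7, 9, 10, 11}) {
            System.out.println(c + " -> " + classify(c));
        }
    }
}
```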

      The first two are fine, because on most architectures they are lowered to
      a single ADD, SUB or SHIFT instruction.

      The third pattern, however, does not always perform well on some architectures, such as AArch64. According to the Arm optimization guide, when the shift amount is greater than 4, an ADD instruction with a shifted operand has the same latency and throughput as a MUL instruction. In that case converting the MUL to an ADD is not profitable. Hence, applying this transformation unconditionally at the mid-end IR level may cause performance regressions in some cases.
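
      To make the threshold concrete, the sketch below models only the simplest form of the third pattern, C = (1<<m) + 1, where the product is a single ADD with operand shift m. The threshold of 4 is the one quoted above from the Arm optimization guide; everything else (class name, method name, the decision to reject other constants) is a hypothetical illustration and not a proposed fix or existing HotSpot logic.

```java
public class ShiftedAddCost {
    // Largest shift amount for which, per the issue text, the shifted ADD
    // is still cheaper than MUL on the AArch64 cores being discussed.
    static final int MAX_CHEAP_SHIFT = 4;

    // For c = (1 << m) + 1 with m > 0, x * c can be computed as x + (x << m).
    // Returns true if that single shifted ADD is expected to beat a MUL.
    static boolean shiftedAddBeatsMul(int c) {
        int rest = c - 1;
        if (c < 3 || Integer.bitCount(rest) != 1) {
            throw new IllegalArgumentException("expected c = (1 << m) + 1 with m > 0");
        }
        int m = Integer.numberOfTrailingZeros(rest);  // the shift amount
        return m <= MAX_CHEAP_SHIFT;
    }

    public static void main(String[] args) {
        System.out.println("x*9:  " + shiftedAddBeatsMul(9));   // shift amount 3
        System.out.println("x*33: " + shiftedAddBeatsMul(33));  // shift amount 5
    }
}
```

      Under these assumptions `x*9` and `x*17` would still be worth rewriting, while `x*33` would be better left as a multiplication.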

            xgong Xiaohong Gong