Type: Enhancement
Resolution: Unresolved
Priority: P3
Affects Version/s: 5.0, 9, 10
CPU: generic
OS: generic
Generally, array address expressions are reassociated so that
small constants are added last. This is good for expressions
involving the loop induction variable because it is likely that
loop unrolling will create expressions like:
(array_base + (i<<2)) + 12 # 12 bytes of object header
(array_base + (i<<2)) + 16
(array_base + (i<<2)) + 20
But this may not be the best when the index expression
does not include the induction variable. Loop unrolling
may create:
(array_base + exp0) + 12
(array_base + exp1) + 12
(array_base + exp2) + 12
which could be reassociated as:
derived_base = array_base + 12
derived_base + exp0
derived_base + exp1
derived_base + exp2
The reassociated form lets the memory instruction use its
two-register addressing mode to perform the add.
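For illustration, here is a minimal sketch in plain Java long arithmetic (not
HotSpot IR; the 12-byte header and all names are assumptions for the example)
showing that hoisting the constant into a derived base leaves only a
register+register add per access:

class DerivedBaseSketch {
    static final long HEADER = 12; // assumed object header size, as in the example above

    // Before reassociation: each access re-adds the constant last.
    static long addressBefore(long arrayBase, long exp) {
        return (arrayBase + exp) + HEADER; // (array_base + expN) + 12
    }

    // After reassociation: the constant folds into one shared derived base.
    static long addressAfter(long derivedBase, long exp) {
        return derivedBase + exp;          // derived_base + expN
    }

    public static void main(String[] args) {
        long arrayBase = 0x10000L;
        long derivedBase = arrayBase + HEADER;        // computed once, outside the loop
        for (long exp : new long[] { 0, 4, 8, 12 }) { // exp0, exp1, ...
            if (addressBefore(arrayBase, exp) != addressAfter(derivedBase, exp))
                throw new AssertionError("addresses differ");
        }
        System.out.println("reassociated addresses match");
    }
}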
In the test case below, the compiler should use a derived pointer for R_L0+#12
and fold the add into the two-register form of the LDUW addressing mode:
ADD R_L6,R_L0,R_L0
LDUW [R_L0 + #12],R_L0
becomes
LDUW [R_L6, R_L0],R_L0 # where R_L6 is derived: "+ #12"
public class CRC32 implements Checksum {
    private static final int[] crc_table = {...}; /* 256 constants */

    /* Rolled, use "unsigned byte idiom" */
    private static int updateBytes(int crc, byte[] b,
                                   final int off, final int len) {
        final int[] table = crc_table;
        final int limit = off + len;
        crc = crc ^ 0xffffffff;
        for (int i = off; i < limit; i++) {
            crc = table[(crc ^ ((int)b[i] & 0xff)) & 0xff]
                  ^ (crc >>> 8);
        }
        return crc ^ 0xffffffff;
    }
}
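The index expression (crc ^ (b[i] & 0xff)) & 0xff does not involve the induction
variable i, so every table[...] access re-adds the same constant offset. A small
sketch of that address shape, assuming 4-byte int elements and the 12-byte array
header implied by "+ #12" above (all names are illustrative):

class TableAddressSketch {
    static final long INT_ARRAY_HEADER = 12; // assumed header size

    // table[idx] lives at: (table_base + (idx << 2)) + 12
    static long elementAddress(long tableBase, int idx) {
        return (tableBase + ((long) idx << 2)) + INT_ARRAY_HEADER;
    }

    public static void main(String[] args) {
        long tableBase = 0x20000L;
        int crc = 0xffffffff;
        byte sample = (byte) 0x5a;
        // The index depends on crc and the byte value, not on i, so each
        // iteration re-adds +12 unless a derived base (table_base + 12) is
        // formed once outside the loop.
        int idx = (crc ^ (sample & 0xff)) & 0xff;
        System.out.println(Long.toHexString(elementAddress(tableBase, idx)));
    }
}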
From Mike Paleczny:
Yes, the normal canonicalization which moves or keeps small
constants near the root of each AddP expression tree doesn't
do the best thing for:
(AddP1 (AddP2 L6 SLL_1) 12)
(AddP1 (AddP2 L6 SLL_2) 12)
(AddP1 (AddP2 L6 SLL_3) 12)
...
I think this is an instance of the general "reassociation" problem.
That is, swapping the ordering produces an unchanging
derived pointer:
(AddP1 (AddP2 L6 12) SLL_1)
(AddP1 (AddP2 L6 12) SLL_2)
(AddP1 (AddP2 L6 12) SLL_3)
...
Once reassociated, all the (AddP2 L6 12) expressions
will fold together.
###@###.### 2005-1-13 18:07:14 GMT
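As a rough illustration of why the swapped shape folds, here is a toy
value-numbering sketch in Java (not C2 code; all names are hypothetical): once
both trees are written as (AddP (AddP L6 12) SLL_n), the inner (L6 + 12) hashes
to a single shared node while the outer adds stay distinct.

import java.util.HashMap;
import java.util.Map;

class ReassocCseSketch {
    // Simplified expression node: either a leaf name or an add of two nodes.
    record Node(String leaf, Node left, Node right) {
        static Node leaf(String name) { return new Node(name, null, null); }
        static Node add(Node l, Node r) { return new Node(null, l, r); }
    }

    // Value numbering: structurally identical nodes map to one canonical instance.
    static final Map<Node, Node> vnTable = new HashMap<>();
    static Node gvn(Node n) { return vnTable.computeIfAbsent(n, k -> k); }

    public static void main(String[] args) {
        Node base = gvn(Node.leaf("L6"));
        Node twelve = gvn(Node.leaf("12"));
        Node sll1 = gvn(Node.leaf("SLL_1"));
        Node sll2 = gvn(Node.leaf("SLL_2"));

        // Constant-last shape: (AddP (AddP L6 SLL_n) 12).
        // The inner adds differ, so nothing is shared between unrolled copies.
        Node a1 = gvn(Node.add(gvn(Node.add(base, sll1)), twelve));
        Node a2 = gvn(Node.add(gvn(Node.add(base, sll2)), twelve));
        System.out.println("constant-last roots shared: " + (a1 == a2));   // false

        // Swapped shape: (AddP (AddP L6 12) SLL_n).
        // The inner (L6 + 12) value-numbers to a single shared derived pointer.
        Node d1 = gvn(Node.add(base, twelve));
        Node d2 = gvn(Node.add(base, twelve));
        System.out.println("derived base shared: " + (d1 == d2));          // true
        Node b1 = gvn(Node.add(d1, sll1));
        Node b2 = gvn(Node.add(d2, sll2));
        System.out.println("outer adds remain distinct: " + (b1 != b2));   // true
    }
}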
Relates to:
JDK-8154826: AArch64: take better advantage of base + shifted offset addressing mode (Resolved)