Type: Enhancement
Resolution: Fixed
Priority: P3
Fix Version/s: hs17
Affects Version/s: hs17
Component/s: hotspot
Labels: None
Subcomponent: compiler
Resolved In Build: b09
CPU: sparc
OS: solaris_9

Issue	Fix Version	Assignee	Priority	Status	Resolution	Resolved In Build
JDK-2188213	7	Tom Rodriguez	P3	Closed	Fixed	b83
JDK-2189824	6u21	Tom Rodriguez	P3	Resolved	Fixed	b01

Hi Tom, Christian, and others,

Here's a patch I'd like to contribute:
http://cr.openjdk.java.net/~rasbold/69XXXXX/webrev.00/

With it, C2 generates shorter instruction sequences for long
multiplication on x86_32 when the high 32 bits are known to be zero.

In particular, this applies to the loop in BigInteger.mulAdd():

   private final static long LONG_MASK = 0xffffffffL;

   static int mulAdd(int[] out, int[] in, int offset, int len, int k) {
       long kLong = k & LONG_MASK;
       long carry = 0;

       offset = out.length-offset - 1;
       for (int j=len-1; j >= 0; j--) {
           long product = (in[j] & LONG_MASK) * kLong +
                          (out[offset] & LONG_MASK) + carry;
           out[offset--] = (int)product;
           carry = product >>> 32;
       }
       return (int)carry;
   }

In my measurements, one of our internal microbenchmarks that uses
BigInteger.mulAdd sped up by about 12%. Also, SPECjvm2008's crypto.rsa
and crypto.signverify improved by about 7% and 2.3%, respectively.
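
To make the arithmetic behind this concrete, here is a minimal Java sketch of why zero high halves shorten the multiply. It is not the patch and not the code C2 emits; the class LongMulSketch and the methods mulGeneral and mulHighZero are hypothetical names introduced only for illustration. The assumption is that a 32-bit target builds a 64-bit product from 32-bit halves, so the two cross terms disappear when both operands have been masked with LONG_MASK, as in the mulAdd loop above.

```java
// Illustrative sketch only; class and method names are hypothetical.
final class LongMulSketch {
    private static final long LONG_MASK = 0xffffffffL;

    /** Low 64 bits of a * b, assembled from 32-bit halves (the general case). */
    static long mulGeneral(long a, long b) {
        long aLo = a & LONG_MASK, aHi = a >>> 32;
        long bLo = b & LONG_MASK, bHi = b >>> 32;
        // aHi * bHi is shifted out of the 64-bit result entirely; the two
        // cross terms only contribute to the upper 32 bits.
        long cross = (aHi * bLo + aLo * bHi) << 32;
        return aLo * bLo + cross;
    }

    /** Same product when both high halves are zero: a single 32x32->64 multiply. */
    static long mulHighZero(long a, long b) {
        return (a & LONG_MASK) * (b & LONG_MASK);
    }

    public static void main(String[] args) {
        long x = -7 & LONG_MASK;        // zero-extended int, as in mulAdd
        long k = 123456789 & LONG_MASK; // zero-extended multiplier
        System.out.println(mulGeneral(x, k) == x * k);
        System.out.println(mulHighZero(x, k) == x * k);
    }
}
```

Both comparisons in main hold: with aHi and bHi equal to zero, cross is zero and mulGeneral reduces to the single multiply in mulHighZero.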

Backported by:
JDK-2189824 optimize 64 long multiply for case with high bits zero (Resolved)
JDK-2188213 optimize 64 long multiply for case with high bits zero (Closed)

Assignee: Tom Rodriguez
Reporter: Tom Rodriguez
Votes: 0
Watchers: 0

Created: 2010-02-01 15:14
Updated: 2010-04-02 15:29
Resolved: 2010-02-09 12:32
Imported: 17/Sep/12 12:57 AM
Indexed: 19/Jul/12 6:09 PM
