Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2188213 | 7 | Tom Rodriguez | P3 | Closed | Fixed | b83 |
JDK-2189824 | 6u21 | Tom Rodriguez | P3 | Resolved | Fixed | b01 |
Hi Tom, Christian, and others,
Here's a patch I'd like to contribute:
http://cr.openjdk.java.net/~rasbold/69XXXXX/webrev.00/
With it, C2 generates shorter long multiplication sequences on x86_32
when the high 32 bits are known to be zero.
In particular, this applies to the loop in BigInteger.mulAdd():
    private final static long LONG_MASK = 0xffffffffL;

    static int mulAdd(int[] out, int[] in, int offset, int len, int k) {
        long kLong = k & LONG_MASK;
        long carry = 0;
        offset = out.length-offset - 1;
        for (int j=len-1; j >= 0; j--) {
            long product = (in[j] & LONG_MASK) * kLong +
                           (out[offset] & LONG_MASK) + carry;
            out[offset--] = (int)product;
            carry = product >>> 32;
        }
        return (int)carry;
    }
In my measurements, one of our internal microbenchmarks that uses
BigInteger.mulAdd sped up about 12%. Also, SPECjvm2008's crypto.rsa
and crypto.signverify improved about 7% and 2.3%, respectively.
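
For readers unfamiliar with why zero high bits allow a shorter sequence, the arithmetic can be sketched in Java. This is only an illustration of the identity involved, not the patch itself or the code C2 emits. The class and method names (LongMulSketch, mulGeneral, mulHighZero) are hypothetical. A 64-bit product built from 32-bit halves needs one full 32x32->64 multiply plus two cross terms; when both high halves are known to be zero, the cross terms vanish and the single full multiply is the whole result.

```java
// Hypothetical illustration; names are not taken from the patch.
final class LongMulSketch {
    private static final long LONG_MASK = 0xffffffffL;

    // General 64x64->64 multiply written in terms of 32-bit halves,
    // roughly the shape a 32-bit target has to compute.
    static long mulGeneral(long a, long b) {
        long aLo = a & LONG_MASK, aHi = a >>> 32;
        long bLo = b & LONG_MASK, bHi = b >>> 32;
        long low = aLo * bLo;                // full 32x32->64 product
        long cross = aHi * bLo + aLo * bHi;  // only its low 32 bits survive the shift
        return low + (cross << 32);          // the aHi*bHi term falls outside 64 bits
    }

    // Both operands are zero-extended ints, so the high halves are zero
    // and the cross terms above contribute nothing.
    static long mulHighZero(int a, int b) {
        return (a & LONG_MASK) * (b & LONG_MASK);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 12345, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int a : samples) {
            for (int b : samples) {
                long expected = (a & LONG_MASK) * (b & LONG_MASK);
                if (mulGeneral(a & LONG_MASK, b & LONG_MASK) != expected
                        || mulHighZero(a, b) != expected) {
                    throw new AssertionError("mismatch for " + a + " * " + b);
                }
            }
        }
        System.out.println("all products agree");
    }
}
```

The masked operands in mulAdd() have exactly this form, which is why that loop is a natural beneficiary.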
- backported by
  - JDK-2189824 optimize 64 long multiply for case with high bits zero (Resolved)
  - JDK-2188213 optimize 64 long multiply for case with high bits zero (Closed)