- Type: Enhancement
- Resolution: Fixed
- Priority: P4
- Fix Version: 20
- Resolved In Build: b26
- CPU: riscv
- OS: linux
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8311739 | 17.0.9 | Fei Yang | P4 | Resolved | Fixed | b01 |
Currently, in C2, Math.min/max is implemented in c2_MacroAssembler_riscv.cpp using
void C2_MacroAssembler::minmax_FD(FloatRegister dst, FloatRegister src1, FloatRegister src2, bool is_double, bool is_min)
The main issue is that Math.min/max is required to return NaN if either of its arguments is NaN. On RISC-V, fmin/fmax return NaN (a quiet NaN) only if both source registers hold NaN.
That requires additional logic to handle the case where only one of the sources is NaN.
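The mismatch described above can be illustrated with a small stand-alone C++ model. This is only a sketch, not HotSpot code: `riscv_like_fmax`, `java_like_max`, and `check_nan_mismatch` are hypothetical names, and `std::fmax` is used as a stand-in for the RISC-V instruction because it also returns the non-NaN operand when exactly one input is NaN. Signed-zero ordering is not modeled.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Stand-in for the NaN behavior of RISC-V fmax.s (assumption: std::fmax is
// used only as a model; it returns the non-NaN operand when exactly one
// input is NaN, and NaN when both are NaN).
inline float riscv_like_fmax(float a, float b) {
    return std::fmax(a, b);
}

// Model of what Math.max(float, float) must return with respect to NaN:
// NaN if either argument is NaN.
inline float java_like_max(float a, float b) {
    if (std::isnan(a) || std::isnan(b)) {
        return std::numeric_limits<float>::quiet_NaN();
    }
    return std::fmax(a, b);
}

// Self-check showing where the two models disagree.
inline void check_nan_mismatch() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(riscv_like_fmax(nan, 1.0f) == 1.0f);     // single NaN is dropped
    assert(std::isnan(java_like_max(nan, 1.0f)));   // single NaN is propagated
    assert(std::isnan(riscv_like_fmax(nan, nan)));  // both NaN: result is NaN
}
```

Only the single-NaN case differs between the two models, which is the case the extra logic in minmax_FD has to cover.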
Currently it’s done this way (I’ve dropped the is_double and is_min cases for readability):
fmax_s(dst, src1, src2);
// Checking NaNs
flt_s(zr, src1, src2);
frflags(t0);
beqz(t0, Done);
// In case of NaNs
fadd_s(dst, src1, src2);
bind(Done);
Here we always do two floating-point comparisons (one in fmax, one in flt); perf shows they take equal time (measured on a T-Head C910).
I think that’s suboptimal and can be improved: move the NaN check before fmin/fmax and, if either input is NaN, return NaN without executing fmax.
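The reordering proposed here can be sketched in plain C++ as follows. This is an illustrative model under stated assumptions, not the committed implementation: `max_check_after`, `max_check_first`, and `check_orders_agree` are hypothetical names, `std::fmax` stands in for fmax_s, `std::isnan` stands in for whatever NaN test the assembler sequence uses, and `a + b` stands in for the fadd_s that produces NaN.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Current order: fmax is always executed, then the NaN check runs,
// and the result is overwritten on the NaN path.
inline float max_check_after(float a, float b) {
    float dst = std::fmax(a, b);            // stand-in for fmax_s
    if (std::isnan(a) || std::isnan(b)) {   // stand-in for the flt_s/frflags check
        dst = a + b;                        // stand-in for fadd_s: yields NaN
    }
    return dst;
}

// Proposed order: the NaN check runs first, and fmax is executed
// only when neither input is NaN.
inline float max_check_first(float a, float b) {
    if (std::isnan(a) || std::isnan(b)) {
        return a + b;                       // NaN path, fmax is skipped
    }
    return std::fmax(a, b);
}

// Self-check: both orders return the same results on these inputs.
inline void check_orders_agree() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(std::isnan(max_check_after(nan, 1.0f)));
    assert(std::isnan(max_check_first(nan, 1.0f)));
    assert(max_check_after(2.0f, 3.0f) == max_check_first(2.0f, 3.0f));
}
```

In this model the results are identical; the difference is only that the NaN path no longer pays for an fmax whose result is discarded.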
More - https://mail.openjdk.org/pipermail/riscv-port-dev/2022-November/000676.html
Backported by:
- JDK-8311739 RISC-V: improve performance of floating Max Min intrinsics (Resolved)

Links to:
- Commit openjdk/jdk17u-dev/966fc82d
- Commit openjdk/jdk/99d3840d
- Commit openjdk/riscv-port-jdk17u/54a78c77
- Review openjdk/jdk17u-dev/1427
- Review openjdk/jdk/11276
- Review openjdk/jdk/11327
- Review openjdk/riscv-port-jdk17u/49