- Bug
- Resolution: Duplicate
- P3
- None
- 6
- sparc
- solaris_9
I've been looking into some strange performance issues for
our VISified rendering loops. I finally found that the regressions
were introduced in b28 with the "isolate compiler optimizations"
changes (see 6228665).
Digging deeper, I found this in Defs-solaris.gmk:
    # Highest could be -xO5, but indications are that -xO5 should be reserved
    # for a per-file use, on sources with known performance impacts.
    ifeq ($(COMPILER_REV), 5.7)
      # Fixup when 5.7 compiler bugs fixed sparc and x86 (6237550 6237514)
      CC_HIGHEST_OPT = -xO3
      CC_HIGHER_OPT = -xO3
    else
      CC_HIGHEST_OPT = -xO4
      CC_HIGHER_OPT = -xO4
    endif
So the change between b27 and b28 is that we use -xO3 instead of
-xO4 when compiling our VIS loops. It looks like the two bugs
referenced there have been fixed recently in the compilers, but
I'm not sure when those fixes will be available to us (so that
we can possibly go back to -xO4).
Anyway, in my personal workspace (in sync with master), I tried
compiling our VIS loops with -xO4 instead of -xO3. Here is
a sampling of some of the results I'm seeing in J2DBench (copying
from the specified source image to an IntArgbPre destination):
graphics.imaging.src=IntArgb translucent,graphics.opts.sizes=250:
    5.0:      71593.18637 (var=0.6%)  (100.0%)
    b27:      75491.85214 (var=5.45%) (105.45%)
    b28:      33905.06581 (var=5.52%) (47.36%)
    b29.xO4:  75355.14205 (var=0.81%) (105.25%)
graphics.imaging.src=IntXrgb opaque,graphics.opts.sizes=250:
    5.0:      235551.75191 (var=0.93%) (100.0%)
    b27:      222731.85483 (var=1.27%) (94.56%)
    b28:      205799.43616 (var=1.53%) (87.37%)
    b29.xO4:  70202.37615 (var=0.56%) (29.8%)
So you'll see for IntArgb in b28 we took a big hit (~55% slowdown
compared to b27, or ~53% compared to 5.0). We got back our b27
performance by using -xO4 again. But note the strange results for
IntXrgb... Using -xO4 made us much worse there (under 30% of the
5.0 score)?!
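To make the baselines explicit, here is a small sketch that recomputes the relative percentages from the raw scores quoted above. The helper names (percent_of, relative_table) are made up for illustration and are not part of J2DBench; the only assumption is that a higher score is better, as the percentages in the results imply.

```python
# Scores copied from the J2DBench results quoted in this report.
# Assumption: higher is better.
scores = {
    "IntArgb translucent": {
        "5.0": 71593.18637,
        "b27": 75491.85214,
        "b28": 33905.06581,
        "b29.xO4": 75355.14205,
    },
    "IntXrgb opaque": {
        "5.0": 235551.75191,
        "b27": 222731.85483,
        "b28": 205799.43616,
        "b29.xO4": 70202.37615,
    },
}


def percent_of(score, baseline):
    """Return score expressed as a percentage of baseline."""
    return 100.0 * score / baseline


def relative_table(results, baseline_key):
    """Map each build to its score as a percentage of the chosen baseline."""
    baseline = results[baseline_key]
    return {
        build: round(percent_of(score, baseline), 2)
        for build, score in results.items()
    }


if __name__ == "__main__":
    for test_name, results in scores.items():
        print(test_name)
        # Same numbers, two different baselines: 5.0 and b27.
        print("  vs 5.0:", relative_table(results, "5.0"))
        print("  vs b27:", relative_table(results, "b27"))
```

Against the 5.0 baseline this reproduces the percentages listed with the results; against b27, the b28 IntArgb score comes out at roughly 45% of b27.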
Clearly we have to look into this further. I've only scratched
the surface by looking at 5 or 6 different tests, but the results
are all over the map depending on the compilers and compiler
options being used. I looked at which compiler options were
used to build vis_IntArgbPre.c in both b27 and b28 to see if
there were any differences. I cleaned up the lists, sorted them,
and diffed them (the results are in the comments section).
###@###.### 2005-03-31 23:12:40 GMT
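The "clean up, sort, and diff" step for the compiler options can be sketched roughly as follows. This is only an illustration: the two command lines are hypothetical placeholders (apart from -xO3/-xO4 and the file name, which come from this report), and the real option lists are the ones in the comments section. It also assumes every option is a single whitespace-separated token, which does not hold for options that take a separate argument.

```python
def normalize(command_line):
    """Split a compiler command line into a sorted list of unique tokens."""
    return sorted(set(command_line.split()))


def diff_options(old_cmd, new_cmd):
    """Return (only_in_old, only_in_new) as sorted lists of tokens."""
    old = set(normalize(old_cmd))
    new = set(normalize(new_cmd))
    return sorted(old - new), sorted(new - old)


# Hypothetical command lines, NOT the real b27/b28 build output.
# -DEXAMPLE_FLAG is a made-up placeholder option.
b27_cmd = "cc -xO4 -DEXAMPLE_FLAG -c vis_IntArgbPre.c"
b28_cmd = "cc -xO3 -DEXAMPLE_FLAG -c vis_IntArgbPre.c"

only_b27, only_b28 = diff_options(b27_cmd, b28_cmd)

if __name__ == "__main__":
    print("only in b27:", only_b27)
    print("only in b28:", only_b28)
```

With these placeholder inputs the only difference reported is the optimization level; the real lists may of course differ in other options too.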
- duplicates
  JDK-6236988 Increase default Solaris SS10 optimization level on PRODUCT=java libraries (Resolved)
- relates to
  JDK-6228665 Isolate all C/C++ compiler optimization options to j2se/make/common/Defs* files (Resolved)