JDK-6249015

REG: performance regression for 2D VIS loops in mustang-b28

    • Type: Bug
    • Resolution: Duplicate
    • Priority: P3
    • None
    • 6
    • client-libs
    • 2d
    • sparc
    • solaris_9

      I've been looking into some strange performance issues for
      our VISified rendering loops. I finally found that the regressions
      were introduced in b28 with the "isolate compiler optimizations"
      changes (see 6228665).

      Digging deeper, I found this in Defs-solaris.gmk:
        # Highest could be -xO5, but indications are that -xO5 should be reserved
        # for a per-file use, on sources with known performance impacts.
        ifeq ($(COMPILER_REV), 5.7)
          # Fixup when 5.7 compiler bugs fixed sparc and x86 (6237550 6237514)
          CC_HIGHEST_OPT = -xO3
          CC_HIGHER_OPT = -xO3
        else
          CC_HIGHEST_OPT = -xO4
          CC_HIGHER_OPT = -xO4
        endif

      So the change between b27 and b28 is that we use -xO3 instead of
      -xO4 when compiling our VIS loops. It looks like the two bugs
      referenced there have been fixed recently in the compilers, but
      I'm not sure when those fixes will be available to us (so that
      we can possibly switch back to -xO4).

      Anyway, in my personal workspace (in sync with master), I tried
      compiling our VIS loops with -xO4 instead of -xO3. Here is
      a sampling of some of the results I'm seeing in J2DBench (copying
      from the specified source image to an IntArgbPre destination):

      graphics.imaging.src=IntArgb translucent,graphics.opts.sizes=250:
        5.0:     71593.18637 (var=0.6%)  (100.0%)
        b27:     75491.85214 (var=5.45%) (105.45%)
        b28:     33905.06581 (var=5.52%) (47.36%)
        b29.xO4: 75355.14205 (var=0.81%) (105.25%)

      graphics.imaging.src=IntXrgb opaque,graphics.opts.sizes=250:
        5.0:     235551.75191 (var=0.93%) (100.0%)
        b27:     222731.85483 (var=1.27%) (94.56%)
        b28:     205799.43616 (var=1.53%) (87.37%)
        b29.xO4: 70202.37615 (var=0.56%) (29.8%)

      So you'll see for IntArgb in b28 we took a big hit (~55% slowdown
      compared to b27). We got back our original performance by using
      -xO4 again. But note the strange results for IntXrgb... Using
      -xO4 made us even worse (down to 29.8% of 5.0)?!?!
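
The percentages in parentheses above are relative to the 5.0 score, so a comparison against b27 has to be recomputed from the raw numbers. A minimal sketch of that arithmetic follows. The scores are copied from the results above; the helper names `relative` and `slowdown` are made up for illustration and are not part of J2DBench.

```python
# J2DBench scores copied from the results above (higher is better).
scores = {
    "IntArgb translucent": {
        "5.0": 71593.18637,
        "b27": 75491.85214,
        "b28": 33905.06581,
        "b29.xO4": 75355.14205,
    },
    "IntXrgb opaque": {
        "5.0": 235551.75191,
        "b27": 222731.85483,
        "b28": 205799.43616,
        "b29.xO4": 70202.37615,
    },
}


def relative(score, baseline):
    """Return score as a percentage of baseline (hypothetical helper)."""
    return 100.0 * score / baseline


def slowdown(score, baseline):
    """Return how far score falls below baseline, in percent of baseline."""
    return 100.0 - relative(score, baseline)


# Print each build relative to both the 5.0 and the b27 scores.
for test, runs in scores.items():
    print(test)
    for build, score in runs.items():
        print(f"  {build}: {relative(score, runs['5.0']):.2f}% of 5.0, "
              f"{relative(score, runs['b27']):.2f}% of b27")
```

For IntArgb this puts b28 at roughly 47% of the 5.0 score and roughly 45% of the b27 score.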

      Clearly we have to look into this further. I've only scratched
      the surface by looking at 5 or 6 different tests, but the results
      are all over the board depending on the compilers and compiler
      options being used. I looked at which compiler options were
      used to build vis_IntArgbPre.c in both b27 and b28 to see if
      there were any differences. I cleaned up the lists, sorted them,
      and diffed them (the results are in the comments section).
      ###@###.### 2005-03-31 23:12:40 GMT
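
The clean/sort/diff step described above can be sketched roughly as follows. This is only an illustration: the two command lines are hypothetical stand-ins (the real option lists are in the comments section and are not reproduced here), and `option_diff` is a made-up helper name. It treats every whitespace-separated token as one option, so an option that takes a separate argument would show up as two tokens.

```python
import shlex


def option_diff(old_cmd, new_cmd):
    """Return (removed, added) options between two compiler command lines.

    Each command line is split into tokens, duplicates are dropped, and the
    two token sets are compared; both results come back sorted.
    """
    old = set(shlex.split(old_cmd))
    new = set(shlex.split(new_cmd))
    return sorted(old - new), sorted(new - old)


# Hypothetical command lines, NOT the actual b27/b28 build options.
b27_cmd = "cc -xO4 -DVIS -c vis_IntArgbPre.c"
b28_cmd = "cc -xO3 -DVIS -c vis_IntArgbPre.c"

removed, added = option_diff(b27_cmd, b28_cmd)
print("only in b27:", removed)
print("only in b28:", added)
```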

            flar Jim Graham
            campbell Christopher Campbell (Inactive)