JDK-8106483

Hand tune the Java and SSE based GaussianBlur filter implementations

    • Type: Bug
    • Resolution: Fixed
    • Priority: P3
    • fx1.1
    • fx1.0
    • javafx

      The GaussianBlur filter is used in many common situations, including as a component of the Bloom and Glow effects. Right now, any device without hardware (GPU) acceleration falls back to either a natively compiled (SSE-enabled) function or a Java method, both generated by the automated JSL compiler. The generated code is surprisingly fast for code derived from a shader source language, but rewriting the inner loops by hand would make it faster still.

      Further, the automatically generated SSE and Java loops have an interesting problem: the Java loops are actually faster than the SSE loops.

      A quick hand-tuning of the inner loops of the Java and SSE backends for GaussianBlur shows speedups ranging from about 3x to about 11x.
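For readers unfamiliar with what a hand-written blur loop looks like, here is a minimal sketch of one horizontal pass of a separable Gaussian blur over packed ARGB pixels. This is not the code from the actual fix, which this issue does not show. The class name `HandTunedBlurSketch`, both method names, the sigma choice (radius / 3), and the transparent-edge handling are all assumptions made for illustration.

```java
/** Hypothetical sketch only; not the actual Java or SSE backend code. */
final class HandTunedBlurSketch {

    /** Builds a normalized 1-D Gaussian kernel with 2 * radius + 1 taps. */
    static float[] gaussianKernel(int radius) {
        // Assumption: sigma = radius / 3; the real filter may choose differently.
        float sigma = Math.max(radius / 3.0f, 0.1f);
        float[] k = new float[2 * radius + 1];
        float sum = 0f;
        for (int i = -radius; i <= radius; i++) {
            float w = (float) Math.exp(-(i * i) / (2.0 * sigma * sigma));
            k[i + radius] = w;
            sum += w;
        }
        for (int i = 0; i < k.length; i++) {
            k[i] /= sum; // normalize so the weights sum to 1
        }
        return k;
    }

    /** One horizontal pass; pixels outside the image count as transparent. */
    static void blurHorizontal(int[] src, int[] dst, int w, int h, float[] kernel) {
        int radius = kernel.length / 2;
        for (int y = 0; y < h; y++) {
            int row = y * w;
            for (int x = 0; x < w; x++) {
                // Clamp the tap range once per pixel so the inner loop
                // needs no per-tap bounds test.
                int lo = Math.max(-radius, -x);
                int hi = Math.min(radius, w - 1 - x);
                float a = 0f, r = 0f, g = 0f, b = 0f;
                for (int i = lo; i <= hi; i++) {
                    int p = src[row + x + i];
                    float k = kernel[i + radius];
                    a += k * (p >>> 24);
                    r += k * ((p >> 16) & 0xff);
                    g += k * ((p >> 8) & 0xff);
                    b += k * (p & 0xff);
                }
                dst[row + x] = ((int) (a + 0.5f) << 24)
                             | ((int) (r + 0.5f) << 16)
                             | ((int) (g + 0.5f) << 8)
                             |  (int) (b + 0.5f);
            }
        }
    }

    public static void main(String[] args) {
        int w = 5, h = 1;
        int[] src = {-1, -1, -1, -1, -1}; // five opaque white pixels
        int[] dst = new int[w * h];
        blurHorizontal(src, dst, w, h, gaussianKernel(1));
        for (int p : dst) {
            System.out.println(Integer.toHexString(p));
        }
    }
}
```

A vertical pass would follow the same pattern with a stride of `w` between taps. Unpacking each pixel once per tap and accumulating the four channels in local floats is the kind of straightforward structure a person would write by hand; whether it matches the tuning done for this issue is not stated here.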

      Here is a quick table of the results:

      GaussianBlurTest
                            radius 10    radius 63

      JSL Client Java           135ms       1540ms
      hand Client Java           45ms        347ms

      JSL Server Java           120ms       1210ms
      hand Server Java           35ms        240ms

      JSL SSE                   234ms       2770ms
      hand SSE                   32ms        235ms

      D3D                         7ms         24ms

      ---------------------------

      BloomTest (fixed radius = 10)

      JSL Client Java       150-200ms
      hand Client Java           60ms

      JSL Server Java       125-200ms
      hand Server Java        47-50ms

      JSL SSE                   225ms
      hand SSE                   40ms

      D3D                         5ms

      ---------------------------

      GlowTest (fixed radius = 10)

      JSL Client Java           220ms
      hand Client Java           68ms

      JSL Server Java           232ms
      hand Server Java           55ms

      JSL SSE                   278ms
      hand SSE                   48ms

      D3D                         4ms
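To make the speedup claim easy to check, the following small sketch computes the JSL-to-hand-tuned ratio for each backend from the GaussianBlurTest timings in this issue. The class and method names are hypothetical; the timings are copied from the GaussianBlurTest table.

```java
/** Hypothetical helper; the timings are copied from the GaussianBlurTest table. */
final class SpeedupSketch {

    /** Ratio of generated-code time to hand-tuned time. */
    static double speedup(double jslMs, double handMs) {
        return jslMs / handMs;
    }

    public static void main(String[] args) {
        String[] backend = {"Client Java", "Server Java", "SSE"};
        // Each row holds {radius 10, radius 63} timings in milliseconds.
        double[][] jsl  = {{135, 1540}, {120, 1210}, {234, 2770}};
        double[][] hand = {{45, 347}, {35, 240}, {32, 235}};
        for (int i = 0; i < backend.length; i++) {
            System.out.printf("%-12s radius 10: %.1fx  radius 63: %.1fx%n",
                    backend[i],
                    speedup(jsl[i][0], hand[i][0]),
                    speedup(jsl[i][1], hand[i][1]));
        }
    }
}
```

The smallest ratio is Client Java at radius 10 (135ms / 45ms = 3.0) and the largest is SSE at radius 63 (2770ms / 235ms, roughly 11.8), which is consistent with the stated range of about 3x to about 11x.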

            flar Jim Graham