Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-6941126

Obscure JIT optimizations of immutable algorithms - costly?

XMLWordPrintable

    • Icon: Enhancement Enhancement
    • Resolution: Unresolved
    • Icon: P4 P4
    • tbd
    • 6u19, 9, 10
    • hotspot
    • generic
    • generic

      A DESCRIPTION OF THE REQUEST :
      Running different versions of a simple loop, comparing execution time of local variables, object variables, get-method invocations, constants etc. reveal that loops using only local variables and constants execute slower than loops involving non-local, non-constant variables. This is the case also in Java 1.5.

      Somewhere between jdk1.6.0_05 and jdk1.6.0_17, loops using only local variables and constants will execute as mentioned above the first two times the loop is executed. From the third execution, execution time is significantly reduced (factor 15-20), thus executing faster than the processor can even make the number of iterations defining the loop, but far too slow to indicate a lookup of some cached processing result.

      If loops (or other code fragments) consist only of local variables and constants - constant algorithms (no parameters, no method calls, no non-local variables that are not final), then it must follow that the process will not vary from one execution to the next. Thus, thinking programmers will make sure it is executed only once, placing the result(s) in variables if needed for later access. It stands to reason that the optimization efforts have some running time costs (could that account for the slower execution time?), which would constitute an overall cost to constant algorithms, granting that those are not executed repeatedly in a program (which, as argued above, would not be clever programming).

      Of course, the required analysis leading to the optimization may be swift, and so constitute no great sacrifice. And there may be a perfectly valid justification for slower execution of constant algorithms (or the first two executions thereof since the optimization was implemented)...

      JUSTIFICATION :
      Even if novice programmers often create less than optimal programs, where the optimization in question may be of value, it would be a shame to have that encumbering well-formed programs in which the optimization efforts will never succeed.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      1) testNoLocals - loop using only object variables
      2) testLocalVar - loop using only local variables
      3) testSemiLocals - loop using both local variables and object variables

      I would expect 2) to execute no slower than 1) and 3)
      ACTUAL -
      With java version "1.5.0_06"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05)
      Java HotSpot(TM) Server VM (build 1.5.0_06-b05, mixed mode)

      testNoLocals executes a little bit slower than testLocalVar, while testSemiLocals executes almost twice as fast. After two executions, execution time of testLocalVar drops to only a bit slower than testSemiLocals (early optimization version?)

      With java version "1.6.0_05"
      Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
      Java HotSpot(TM) Server VM (build 10.0-b19, mixed mode)

      testNoLocals executes faster than the other two, while testSemiLocals has become somewhat slower than with Java1.5. testLocalVar is 60-90% slower than testNoLocals in the first two executions, after which execution time drops to equidistant between testNoLocals and testSemiLocals.

      With java version "1.6.0_18"
      Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
      Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)

      testNoLocals executes even faster, while testLocalVar and testSemiLocals have the same execution time (ca. 20% slower than testNoLocals) in the first two executions, after which execution time of testLocalVar drops to ca. 6%

      ---------- BEGIN SOURCE ----------
      import static java.lang.System.*;

      public class Test2
      {
      public final static Test2 test = new Test2( 2 );

      private static void testNoLocals()
      {
      long start = currentTimeMillis();
      for ( test.ix=test.sum=0 ; test.ix<test.moreTestCycles ; test.ix++ )
      if ( test.ix < test.moreTestCycles )
      test.sum += test.objvar();
      long lapse = currentTimeMillis() - start;
      out.println( "Elapsed (no locals sum=" + test.sum + "): " + lapse );

      } //End testNoLocals()

      private static void testSemiLocals()
      {
      int ix, sum;
      long start = currentTimeMillis();
      for ( ix=sum=0 ; ix<test.moreTestCycles ; ix++ )
      if ( ix < test.moreTestCycles )
      sum += test.objvar();
      long lapse = currentTimeMillis() - start;
      out.println( "Elapsed (semi locals sum=" + sum + "): " + lapse );

      } //End testSemiLocals()

      private static void testLocalVar()
      {
      int sum, ix, localvar = 2, moreTestCycles = 5000000 << 6;
      long start = currentTimeMillis();
      for ( ix=sum=0 ; ix<moreTestCycles ; ix++ )
      if ( ix < moreTestCycles )
      sum += localvar;
      long lapse = currentTimeMillis() - start;
      out.println( "Elapsed (local vars sum=" + sum + "): " + lapse );

      } //End testLocalVar()

      public static void main( String[] args )
      {
      for ( int i=0 ; i<10 ; i++ )
      {
      testNoLocals();
      testLocalVar();
      testSemiLocals();
      out.println();
      }

      } //End main( String[] )

      public int sum;

      public int objvar;

      public int ix;

      public int moreTestCycles = 5000000 << 6;

      public Test2( int val ) { this.objvar = val; }

      public int objvar() { return objvar; }

      } //End class Test2

      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      As it seems that algorithm fragments using only local variables and constants are always among the slowest to execute (unless executed more than twice, running on a recent java1.6 version), it actually gives a payoff in execution time if one or more local variables or constants are replaced with object variables or get-method invocations (get-method invocations are as fast as accessing an object variable directly).

            Unassigned Unassigned
            ndcosta Nelson Dcosta (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

              Created:
              Updated:
              Imported:
              Indexed: