Uploaded image for project: 'Code Tools'
  1. Code Tools
  2. CODETOOLS-7902942

Elapsed time of MainAction is including serialization wait time

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Fixed
    • Icon: P4 P4
    • jtreg6.1
    • None
    • tools
    • None

      If you run othervm tests with a concurrency > 1, then the reported test times get inflated. You can see this by running:
      time makec ../build/fastdebug test TEST=test/hotspot/jtreg/vmTestbase/vm/gc/compact JTREG="VERBOSE=all"

      and looking at the elapsed time for the main action:
      $ grep -r "othervm specified" -A1 jtreg_open_test_hotspot_jtreg_vmTestbase_vm_gc_compact/jtreg.log | grep elapsed
      elapsed time (seconds): 121.194
      elapsed time (seconds): 240.631
      elapsed time (seconds): 359.092
      elapsed time (seconds): 478.718
      elapsed time (seconds): 597.433
      elapsed time (seconds): 716.628
      elapsed time (seconds): 836.638
      elapsed time (seconds): 955.156
      elapsed time (seconds): 1074.476
      elapsed time (seconds): 1193.712
      elapsed time (seconds): 1314.116
      elapsed time (seconds): 1433.632
      elapsed time (seconds): 1553.563
      elapsed time (seconds): 1672.996
      elapsed time (seconds): 1792.846
      elapsed time (seconds): 1913.112
      elapsed time (seconds): 1926.819
      elapsed time (seconds): 1926.809
      elapsed time (seconds): 1926.921
      elapsed time (seconds): 1927.093
      elapsed time (seconds): 1927.192
      elapsed time (seconds): 1927.07
      elapsed time (seconds): 1926.896
      elapsed time (seconds): 1927.31
      elapsed time (seconds): 1927.244
      elapsed time (seconds): 1927.33
      elapsed time (seconds): 1927.221
      elapsed time (seconds): 1927.42
      elapsed time (seconds): 1927.158
      elapsed time (seconds): 1927.315
      elapsed time (seconds): 1927.316
      elapsed time (seconds): 1927.311
      elapsed time (seconds): 1927.532
      elapsed time (seconds): 1927.379

      Each test is supposed to be run for 120 seconds, but the reported time ramps up in ~120s increments until it reaches the number that matches the "jtreg concurrency" number, which in my run is 16.

      I think this happens because we start "concurrent" tests. Each test starts by taking a timestamp, then looks if we need "exclusiveAccess" and if we do, then a lock is taken before the othervm test process is forked. This serializes the tests and inflates the times.

      From the jtreg code:
      -----------
      startAction(true); <<< Takes timestamps
      if (script.isCheck()) {
          status = passed(CHECK_PASS);
      } else {
          Lock lock = script.getLockIfRequired(); <<< serializes on lock
          if (lock != null) lock.lock();
          try {
              switch (!othervmOverrideReasons.isEmpty() ? ExecMode.OTHERVM : script.getExecMode()) {
                  case AGENTVM:
                      showMode(ExecMode.AGENTVM);
                      status = runAgentJVM(); <<< starts test
      ----------

      I've tested a workaround for this, and it seems to be solving the problem. However, since I'm not well-versed in this code, I don't know if this have any unforeseen consequences.

      The workaround:
      diff --git a/src/share/classes/com/sun/javatest/regtest/exec/MainAction.java b/src/share/classes/com/sun/javatest/regtest/exec/MainAction.java
      index dc1a6d5..bea9e2a 100644
      --- a/src/share/classes/com/sun/javatest/regtest/exec/MainAction.java
      +++ b/src/share/classes/com/sun/javatest/regtest/exec/MainAction.java
      @@ -333,13 +333,16 @@ public class MainAction extends Action
               if (nativeCode && script.getNativeDir() == null)
                   return error(MAIN_NO_NATIVES);
      - startAction(true);
      -
               if (script.isCheck()) {
      + startAction(true);
                   status = passed(CHECK_PASS);
               } else {
                   Lock lock = script.getLockIfRequired();
                   if (lock != null) lock.lock();
      +
      + // Start action after the lock is taken to ensure correct "elapsed time".
      + startAction(true);
      +
                   try {
                       switch (!othervmOverrideReasons.isEmpty() ? ExecMode.OTHERVM : script.getExecMode()) {
                           case AGENTVM:

            stefank Stefan Karlsson
            stefank Stefan Karlsson
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: