Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-6946101

ShellScaffold.sh based tests can issue "write error: Broken pipe" messages

XMLWordPrintable

    • b04
    • generic
    • generic

      While doing JDK7-B88 baseline testing, I noticed that
      TESTFAIL:com/sun/jdi/GetLocalVariables4Test.sh
      in the SDK/JDK JDI_REGRESSION subsuite timed out with the following output in
      the .jtr file:

      ----------messages:(2/110)*----------
      JDK under test: c:\\work\\local\\baseline_jdks\\1.7.0_b88\\windows-i586
      Timeout signalled after 120 seconds

      #section:shell
      ----------messages:(3/147)----------
      command: shell GetLocalVariables4Test.sh []
      reason: User specified action: run shell GetLocalVariables4Test.sh
      elapsed time (seconds): 123.499
      ----------System.out:(32/1845)----------
      --Compiling first version of /cygdrive/c/work/local/test_results/jdk7/baseline/b88/sdk-jdi-prod-server-fast-mixed.windows-i586/JTwork/classes/com/sun/jdi/aa2596/GetLocalVariables4Targ.java with options: -g
      compiling GetLocalVariables4Targ.java

      --Starting jdb, address=kkkk.2596
      --Starting debuggee with args from @debuggeeVMOptions: -server-fast -Xmixed -XX:DefaultMaxRAMFraction=8 -showversion
      Listening at address: kkkk.2596
      Cmd: c:/work/local/baseline_jdks/1.7.0_b88/windows-i586/bin/java -classpath C:/work/local/test_results/jdk7/baseline/b88/sdk-jdi-prod-server-fast-mixed.windows-i586/JTwork/classes/com/sun/jdi/aa2596 -DHANGINGJAVA-2596_DEB -XX:-ShowMessageBoxOnError -server-fast -Xmixed -XX:DefaultMaxRAMFraction=8 -showversion -Djtreg.classDir=C:/work/local/test_results/jdk7/baseline/b88/sdk-jdi-prod-server-fast-mixed.windows-i586/JTwork/classes/com/sun/jdi/aa2596 -showversion -agentlib:jdwp=transport=dt_shmem,address=kkkk.2596,server=n,suspend=y GetLocalVariables4Targ
      Set uncaught java.lang.Throwable
      Set deferred uncaught java.lang.Throwable
      Initializing jdb ...
      >
      VM Started: No frames on the current call stack

      main[1] Deferring breakpoint GetLocalVariables4Targ:8.
      It will be set after the class is loaded.
      main[1] > Set deferred breakpoint GetLocalVariables4Targ:8
      Howdy!

      Breakpoint hit: "thread=main", GetLocalVariables4Targ.main(), line=8 bci=19
      8 System.out.println("e should be visible"); // @1 breakpoint

      main[1] Method arguments:
      args = instance of java.lang.String[0] (id=339)
      Local variables:
      i = 0
      e = instance of java.lang.ArithmeticException(id=340)
      main[1] e should be visible>
      Goodbye from GetLocalVariables4Targ!
      VM option '-ShowMessageBoxOnError'
      VM option 'DefaultMaxRAMFraction=8'

      The application exited
      ----------System.err:(11/445)----------
      ShellScaffold.sh: Version
      --Sending cmd: stop at GetLocalVariables4Targ:8
      --Sending cmd: run
      java version "1.7.0-ea"
      Java(TM) SE Runtime Environment (build 1.7.0-ea-b88)
      Java HotSpot(TM) Server VM (build 18.0-b02-fastdebug, mixed mode)

      --Sending cmd: locals
      --Sending cmd: cont
      --Sending cmd: quit
      C:/work/shared/mirrors/src_clones/jdk7/tl_jdk_baseline/test/com/sun/jdi/ShellScaffold.sh: line 523: echo: write error: Broken pipe
      result: Error. Execution failed


      test result: Error. Execution failed


      My investigation revealed that GetLocalVariables4Test did indeed timeout
      because it ran at the same as system backups on the test machine in
      question. This is why I'm not listing the complete test path name here
      as the timeout is an invalid failure.

      It turns out that the previous test, GetLocalVariables3Test, has the
      same "write error: Broken pipe" message:

      --Sending cmd: locals
      --Sending cmd: cont
      --Sending cmd: quit
      C:/work/shared/mirrors/src_clones/jdk7/tl_jdk_baseline/test/com/sun/jdi/ShellScaffold.sh: line 523: echo: write error: Broken pipe
      result: Passed. Execution successful


      test result: Passed. Execution successful


      It is this error message and the fact that the test does not fail
      because of it that has caught my attention.

      The code in question from ShellScaffold.sh is:

          508 # Send a cmd to jdb and wait for the jdb prompt to appear.
          509 # We don't want to allow > as a prompt because if the debuggee
          510 # runs for awhile after a command, jdb will show this prompt
          511 # but is not really ready to accept another command for the
          512 # debuggee - ie, a cont in this state will be ignored.
          513 # If it ever becomes necessary to send a jdb command before
          514 # a main[10] form of prompt appears, then this
          515 # code will have to be modified.
          516 cmd()
          517 {
          518 if [ $1 = quit -o -r "$failFile" ] ; then
          519 # if jdb got a cont cmd that caused the debuggee
          520 # to run to completion, jdb can be gone before
          521 # we get here.
          522 echo "--Sending cmd: quit" >& 2
          523 echo quit
          524 # See 6562090. Maybe there is a way that the exit
          525 # can cause jdb to not get the quit.
          526 sleep 5
          527 exit 1
          528 fi

      The "echo quit" on line 523 is getting a SIGPIPE because jdb has been
      given a "cont" command, run to completion, and exited before we can
      send that "quit" command. In the two tests that I've examined in this
      bug, the "cont" command is given after testing is completed and the
      failure to send the "quit" command is benign.

      However, the fact that the "echo quit" command fails without causing
      the test to fail is troublesome. I don't see any uses of 'set -e' or
      'set -u' for detecting unset variable usage.

      It also seems like it would be better to have sort of handshake
      between this "test driver" logic and the jdb process. This would
      allow us to know that the jdb process has finished because we wanted
      it to finish and not because it has run off the end due to a bug.
      In the past, a "missed" breakpoint bug has allowed jdb to run off the
      end. For the longest time, we thought the test was failing due to a
      timeout on an overloaded machine when in reality it was timing out
      because the debuggee VM had gone right past a breakpoint and ran to
      the end of the program. The test driver reported a timeout of the
      last command that it had sent, but the reason for the timeout was
      less than clear.

      We should fix the test driver infrastructure to report something like:

          ERROR: the test program unexpectedly reached the end of execution.

      when we get to the end of the test program without indicating in the
      test program that reaching the end is the next expected action.

      We should also fix the test driver infrastructure to report something like:

          INFO: the test program successfully reached the end of execution.

      when we get to the end of the test program as expected.

      I'm not yet sure how to implement this in ShellScaffold.sh, but I'll
      investigate it.

            ppunegov Pavel Punegov
            dcubed Daniel Daugherty
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved:
              Imported:
              Indexed: