JDK-8188872

runtime/ErrorHandling/TimeoutInErrorHandlingTest.java fails intermittently

Details

    • b24
    • x86_64, sparc_64
    • solaris_11

      Description

        I have seen the following failure for this test:

        ----------System.out:(3/958)----------
        Command line: [/work/shared/bug_hunt/8167108/SMR_prototype_10/build/solaris-x86_64-normal-server-slowdebug/images/jdk/bin/java -cp /work/shared/bug_hunt/8167108/SMR_prototype_10/JTwork_slowdebug/hotspot_jtreg_0/classes/41/runtime/ErrorHandling/TimeoutInErrorHandlingTest.d:/work/shared/bug_hunt/8167108/SMR_prototype_10/open/test/hotspot/jtreg/runtime/ErrorHandling:/work/shared/bug_hunt/8167108/SMR_prototype_10/JTwork_slowdebug/hotspot_jtreg_0/classes/41/test/lib:/work/shared/bug_hunt/8167108/SMR_prototype_10/open/test/lib:/java/re/jtreg/4.2/promoted/latest/binaries/jtreg/lib/javatest.jar:/java/re/jtreg/4.2/promoted/latest/binaries/jtreg/lib/jtreg.jar -XX:+UnlockDiagnosticVMOptions -Xmx100M -XX:ErrorHandlerTest=14 -XX:+TestUnresponsiveErrorHandler -XX:ErrorLogTimeout=16 -XX:-CreateCoredumpOnCrash -version ]
        Found hs_err file. Scanning...
        Found: [timeout occurred during error reporting in step "test unresponsive error reporting step"] after 4 s..
        ----------System.err:(13/916)----------
        java.lang.RuntimeException: hs-err file incomplete (first missing pattern: 1)
                at TimeoutInErrorHandlingTest.main(TimeoutInErrorHandlingTest.java:127)
                at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                at java.base/java.lang.reflect.Method.invoke(Method.java:564)
                at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115)
                at java.base/java.lang.Thread.run(Thread.java:844)

        JavaTest Message: Test threw exception: java.lang.RuntimeException: hs-err file incomplete (first missing pattern: 1)
        JavaTest Message: shutting down test

        STATUS:Failed.`main' threw exception: java.lang.RuntimeException: hs-err file incomplete (first missing pattern: 1)


        The test failed because the hs_err_pid file was empty:

        $ ls -l JTwork_slowdebug/hotspot_jtreg_0/runtime/ErrorHandling/TimeoutInErrorHandlingTest
        total 586381
        -rw------- 1 dcubed green 300308990 Oct 5 12:15 core
        -rw-r--r-- 1 dcubed green 0 Oct 5 12:14 hs_err_pid70.log


        The test was invoked with '-XX:-CreateCoredumpOnCrash',
        but there is a core file. Notice that the timestamp on the
        core file is older than the timestamp on the hs_err_pid file.

        For this particular failure, core file creation took too long
        and hs_err_pid output did not happen.

        So I see two problems here:

        1) The '-XX:-CreateCoredumpOnCrash' option is ignored for some reason.
        2) The hs_err_pid file may not get ANY output and we have to decide
             what that means in the context of this test.
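
        To make problem 2 concrete, here is a minimal sketch of an ordered
        pattern scan that reports an empty hs_err file separately from an
        incomplete one. This is not the code in TimeoutInErrorHandlingTest.java;
        the class name, method names and the second pattern are hypothetical,
        and the zero-based pattern index is an assumption that happens to be
        consistent with the "first missing pattern: 1" message above.

```java
import java.util.List;
import java.util.regex.Pattern;

public class HsErrScanSketch {

    /**
     * Scans the lines once and looks for the patterns in order.
     * Returns the zero-based index of the first pattern that was not
     * found, or -1 if every pattern was found.
     */
    static int firstMissingPattern(List<String> lines, List<Pattern> patterns) {
        int next = 0; // index of the pattern we are currently looking for
        for (String line : lines) {
            if (next < patterns.size() && patterns.get(next).matcher(line).find()) {
                next++;
            }
        }
        return next == patterns.size() ? -1 : next;
    }

    /** Reports an empty file separately from an incomplete one. */
    static String classify(List<String> lines, List<Pattern> patterns) {
        if (lines.isEmpty()) {
            return "hs_err file is empty";
        }
        int missing = firstMissingPattern(lines, patterns);
        return missing < 0
                ? "hs_err file complete"
                : "hs-err file incomplete (first missing pattern: " + missing + ")";
    }

    public static void main(String[] args) {
        List<Pattern> patterns = List.of(
                Pattern.compile("timeout occurred during error reporting in step"),
                // Hypothetical second pattern, for illustration only.
                Pattern.compile("hypothetical second pattern"));

        List<String> oneLine = List.of(
                "[timeout occurred during error reporting in step "
                        + "\"test unresponsive error reporting step\"] after 4 s.");

        System.out.println(classify(oneLine, patterns));
        System.out.println(classify(List.of(), patterns));
    }
}
```

        With the sample input, the first call reports a missing pattern at
        index 1 and the second call reports an empty file, so the two cases
        no longer share one error message.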

        Update: Now that I have started analyzing the bug, I can see errors
        in the comments above. These lines:

        Found hs_err file. Scanning...
        Found: [timeout occurred during error reporting in step "test unresponsive error reporting step"] after 4 s..

        show that the hs_err file was found and that the first expected output
        line was found. I have no explanation for the empty hs_err_pid file:

        -rw-r--r-- 1 dcubed green 0 Oct 5 12:14 hs_err_pid70.log

        Also this line:

        > Notice that the timestamp on the
        > core file is older than the timestamp on the hs_err_pid file.

        is backwards. The core file is newer than the hs_err_pid file.
        Since the core file is generated after the WatcherThread
        detects that hs_err generation is taking too long, it makes
        sense that the core file is newer than the hs_err_pid file.
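
        The ordering and size observations above can be checked
        programmatically rather than by reading 'ls -l' output. The sketch
        below is only an illustration: the class and method names are
        hypothetical, the files are created in a temporary directory, and
        the timestamps are arbitrary values one minute apart, not the ones
        from the failing run.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;

public class CrashArtifactSketch {

    /** True if 'a' was modified strictly later than 'b'. */
    static boolean isNewer(Path a, Path b) throws IOException {
        return Files.getLastModifiedTime(a).compareTo(Files.getLastModifiedTime(b)) > 0;
    }

    /** True if the file exists and has zero length. */
    static boolean isEmpty(Path p) throws IOException {
        return Files.size(p) == 0;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("crash-artifacts");

        // Stand-ins for the two files in the listing: an empty hs_err log
        // and a non-empty core file.
        Path hsErr = Files.createFile(dir.resolve("hs_err_pid70.log"));
        Path core = Files.write(dir.resolve("core"), new byte[] {1});

        // Arbitrary timestamps, one minute apart (core written last).
        Instant base = Instant.ofEpochSecond(1_000_000);
        Files.setLastModifiedTime(hsErr, FileTime.from(base));
        Files.setLastModifiedTime(core, FileTime.from(base.plusSeconds(60)));

        System.out.println("core newer than hs_err: " + isNewer(core, hsErr));
        System.out.println("hs_err empty: " + isEmpty(hsErr));
    }
}
```

        Both checks print true for the stand-in files, which mirrors the
        listing: a core file written after a zero-length hs_err_pid70.log.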

              People

                dcubed Daniel Daugherty