JDK-8258396

SIGILL in jdk.jfr.internal.PlatformRecorder.rotateDisk()

Details

    • jfr
    • b03
    • Not verified

      Description

        We are seeing intermittent crashes at a customer site when JFR is rotating chunks.

        {noformat}
        A fatal error has been detected by the Java Runtime Environment:
        SIGILL (0x4) at pc=0x00007fa665cd4e5e, pid=1, tid=376
        JRE version: OpenJDK Runtime Environment Zulu11.41+23-CA (11.0.8+10) (build 11.0.8+10-LTS)
        Java VM: OpenJDK 64-Bit Server VM Zulu11.41+23-CA (11.0.8+10-LTS, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
        Problematic frame:
        V [libjvm.so+0x8c9e5e]
        Core dump will be written. Default location: //core
        An error report file with more information is saved as:
        /tmp/hs_err_pid1.log
        {noformat}

        Thanks to @evergizova, the culprit was identified as an erroneous memcpy in JfrStorage::flush_regular() or JfrStorage::flush_large(), in combination with musl libc, whose fortify headers insert a trap for cases where the memcpy src and dst regions overlap (https://git.2f30.org/fortify-headers/file/include/string.h.html#l39).

        The problem boils down to this: for a non-empty buffer,
        JfrStorage::flush_regular_buffer() resets cur.pos() to the start offset,
        while cur_pos stays at the start offset + N.
        The subsequent memcpy(cur.pos(), cur_pos, used) then has overlapping
        src and dst regions (given that used > N), and on Alpine Linux
        (musl libc) SIGILL is raised.
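
The overlap condition described above can be illustrated with a small standalone C sketch. It is not HotSpot code: the helper names (`regions_overlap`, `move_unflushed_to_start`, `demo_overlapping_flush`) and the buffer contents are hypothetical, and the overlap test only approximates the kind of check a fortified memcpy performs. Using memmove here is one possible way to make such a copy well-defined, not necessarily the fix applied for this issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of HotSpot or musl): returns nonzero when
 * the n-byte regions starting at dst and src share at least one byte. */
static int regions_overlap(const void *dst, const void *src, size_t n) {
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    return (d < s && d + n > s) || (s < d && s + n > d);
}

/* Hypothetical stand-in for the copy step that runs after the buffer
 * position has been reset to `start`: moves `used` bytes from the stale
 * position (start + n_flushed) back to the start of the buffer.
 * memmove is defined for overlapping regions; memcpy is not. */
static void move_unflushed_to_start(unsigned char *start,
                                    size_t n_flushed, size_t used) {
    assert(start != NULL);
    const unsigned char *cur_pos = start + n_flushed; /* stale position */
    memmove(start, cur_pos, used);
}

/* Models the scenario from the description with N = 2 and used = 8.
 * Returns 1 when the regions overlap and the data still arrives intact. */
static int demo_overlapping_flush(void) {
    unsigned char buf[16] = "xxABCDEFGH";
    const size_t n = 2;    /* bytes already flushed (N) */
    const size_t used = 8; /* bytes still to be moved */
    int overlap = regions_overlap(buf, buf + n, used); /* used > N */
    move_unflushed_to_start(buf, n, used);
    return overlap && memcmp(buf, "ABCDEFGH", used) == 0;
}
```

With used <= N the two regions are disjoint and a plain memcpy would be well-defined. The crash only appears when more data remains than was flushed, which would be consistent with the intermittent nature of the failure.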

              People

                jbachorik Jaroslav Bachorík