JDK-8309756

Occasional crashes with pipewire screen capture on Wayland


Details

    • b03

    Backports

      Description

        Some tests trigger a crash that is hard to reproduce, and it wasn't clear whether it was a bug in pipewire.

        Whilst investigating another issue regarding capture of full screen windows, I found a
        way to make the crash more reproducible.

        On my VirtualBox VM, with just one CPU configured, I could trigger a crash 80% of the time.
        With 6 CPUs configured (my default) it rarely crashed.

        The reported error was typically something like

        munmap_chunk(): invalid pointer

        or

        double free or corruption (out)

        or

        tcache_thread_shutdown(): unaligned tcache chunk detected

        The error occurred during clean up, and the crash went away
        if I commented out the calls to
                fp_pw_thread_loop_stop(pw.loop);
                fp_pw_thread_loop_destroy(pw.loop);

        I turned on pipewire debugging at various levels from 1 to 5, e.g.
        export PIPEWIRE_DEBUG=2
        but it didn't point to a smoking gun. It did seem that we might have
        a threading-related issue, since the number of cores affected reproducibility.

        After studying the code with this in mind, I think I found the problem.
        We are using the thread loop functions (i.e. those mentioned above), and in that
        case it is required to use pw_thread_loop_lock() / pw_thread_loop_unlock()
        around calls that use objects associated with the loop:

        https://pipewire.github.io/pipewire/page_thread_loop.html
        "The lock needs to be held whenever you call any PipeWire function that uses an object associated with this loop."

        That's not very specific, but I noticed that an earlier part of the clean up probably needed these calls and was missing them.
        Adding them as shown here seems to completely cure the crashes:

        --- a/src/java.desktop/unix/native/libawt_xawt/awt/screencast_pipewire.c
        +++ b/src/java.desktop/unix/native/libawt_xawt/awt/screencast_pipewire.c
        @@ -887,8 +887,10 @@ JNIEXPORT jint JNICALL Java_sun_awt_screencast_ScreencastHelper_getRGBPixelsImpl
                     screenProps->captureData = NULL;
                     screenProps->shouldCapture = FALSE;
         
        +            fp_pw_thread_loop_lock(pw.loop);
                     fp_pw_stream_set_active(screenProps->data->stream, FALSE);
                     fp_pw_stream_disconnect(screenProps->data->stream);
        +            fp_pw_thread_loop_unlock(pw.loop);
                 }
             }

        I also noticed that the function doCleanUp() has
                    if (screenProps->data->stream) {
                        fp_pw_stream_disconnect(screenProps->data->stream);
                        fp_pw_stream_destroy(screenProps->data->stream);
                        screenProps->data->stream = NULL;
                    }

        This code
        (1) Again doesn't have any locking. That doesn't seem to be causing a problem here, but I think locking should be added anyway.
        (2) Seems to repeat the disconnect call. I'm inclined to leave this alone for now since it doesn't seem to cause harm,
        but I don't know that it's guaranteed to be idempotent. This needs more analysis, probably under a separate bug id.
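
For point (1), a minimal sketch of what a locked version of the stream teardown in doCleanUp() might look like. This is not the committed fix. The struct definitions and the fp_pw_* functions below are stand-in stubs that only record call order so the sketch compiles on its own. In the JDK sources the fp_pw_* names are function pointers into libpipewire, and the helper name destroy_stream_locked is hypothetical. The sketch also leaves the repeated disconnect from point (2) in place.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in types: NOT the real PipeWire structs, just enough state to trace calls. */
struct pw_thread_loop { int locked; };
struct pw_stream { int connected; };

/* Records the order in which the stubbed calls are made. */
static char call_log[128];

static void log_call(const char *name) {
    strncat(call_log, name, sizeof(call_log) - strlen(call_log) - 1);
    strncat(call_log, ";", sizeof(call_log) - strlen(call_log) - 1);
}

/* Stubs named after the function pointers used in screencast_pipewire.c.
 * Assumption: these only mimic the calls; they do not talk to PipeWire. */
static void fp_pw_thread_loop_lock(struct pw_thread_loop *loop) {
    loop->locked = 1;
    log_call("lock");
}

static void fp_pw_thread_loop_unlock(struct pw_thread_loop *loop) {
    loop->locked = 0;
    log_call("unlock");
}

static int fp_pw_stream_disconnect(struct pw_stream *stream) {
    stream->connected = 0;
    log_call("disconnect");
    return 0;
}

static void fp_pw_stream_destroy(struct pw_stream *stream) {
    (void) stream;
    log_call("destroy");
}

/* Hypothetical helper: tear down a stream while holding the thread loop lock,
 * so the loop thread cannot touch the stream in the middle of the teardown. */
static void destroy_stream_locked(struct pw_thread_loop *loop,
                                  struct pw_stream **stream) {
    if (*stream == NULL) {
        return; /* nothing to clean up */
    }
    fp_pw_thread_loop_lock(loop);
    fp_pw_stream_disconnect(*stream);
    fp_pw_stream_destroy(*stream);
    *stream = NULL;
    fp_pw_thread_loop_unlock(loop);
}
```

With the stubs, calling destroy_stream_locked(&loop, &stream) records lock, disconnect, destroy, unlock in that order and leaves stream set to NULL, so a second call does nothing. Whether the lock is best taken per stream or once around the whole clean up loop is a design choice this sketch does not settle.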

              People

                prr Philip Race