JDK-8309756

Occasional crashes with pipewire screen capture on Wayland



    • b03



        Some tests trigger a crash that is hard to reproduce, and it wasn't clear whether it was a bug in pipewire.

        Whilst investigating another issue regarding capture of full screen windows, I found a
        way to make the crash more reproducible.

        On my VirtualBox VM, with just one CPU configured I could trigger a crash 80% of the time.
        With 6 CPUs configured (my default) it rarely crashed.

        The reported error was typically something like

        munmap_chunk(): invalid pointer


        double free or corruption (out)


        tcache_thread_shutdown(): unaligned tcache chunk detected

        The error came right during clean up, and it didn't crash
        if I commented out calls to

        I turned on pipewire debugging at various levels from 1 to 5, e.g.
        export PIPEWIRE_DEBUG=2
        but that didn't point to a smoking gun. It did seem that we might have
        a threading-related issue, since the number of cores affected reproducibility.

        After studying the code a bit with this in mind, I think I found the problem.
        We are using the thread loop functions (i.e. those mentioned above), and in such
        a case it is required to use pw_thread_loop_lock() / pw_thread_loop_unlock()
        around calls:

        "The lock needs to be held whenever you call any PipeWire function that uses an object associated with this loop."

        That's not super-specific, but I noticed that a previous part of the clean up probably needed these calls and was missing them.
        Adding them as shown here seems to completely cure the crashes:

        --- a/src/java.desktop/unix/native/libawt_xawt/awt/screencast_pipewire.c
        +++ b/src/java.desktop/unix/native/libawt_xawt/awt/screencast_pipewire.c
        @@ -887,8 +887,10 @@ JNIEXPORT jint JNICALL Java_sun_awt_screencast_ScreencastHelper_getRGBPixelsImpl
                     screenProps->captureData = NULL;
                     screenProps->shouldCapture = FALSE;
        +            fp_pw_thread_loop_lock(pw.loop);
                     fp_pw_stream_set_active(screenProps->data->stream, FALSE);
        +            fp_pw_thread_loop_unlock(pw.loop);

        I also noticed that the function doCleanUp() has
                    if (screenProps->data->stream) {
                        screenProps->data->stream = NULL;

        (1) Again, this doesn't have any locking. It doesn't seem to be causing a problem here, but I think locking should be added anyway.
        (2) It seems to repeat the disconnect call. I'm inclined to leave this alone for now since it doesn't seem to cause harm,
        but I don't know that it's guaranteed to be idempotent. This needs more analysis, probably under a separate bug id.
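
        As a rough illustration of points (1) and (2), here is a minimal, self-contained sketch of what
        a locked, run-once stream clean up could look like. It is not the actual doCleanUp() code: the
        struct definitions, mock_loop, cleanup_stream() and the mock fp_* bodies are hypothetical
        stand-ins so that the sketch compiles without PipeWire. fp_pw_stream_disconnect is assumed to
        follow the same fp_ naming as the calls in the diff above. Real code would also need to release
        the stream, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PipeWire types; the real ones are opaque. */
struct pw_thread_loop { int lock_depth; };
struct pw_stream { bool connected; int disconnect_calls; };

/* Single mock loop so the mock stream call can check the locking rule. */
static struct pw_thread_loop mock_loop;

/* Mock lock/unlock: only track nesting depth, no real threads involved. */
static void fp_pw_thread_loop_lock(struct pw_thread_loop *loop) {
    loop->lock_depth++;
}

static void fp_pw_thread_loop_unlock(struct pw_thread_loop *loop) {
    assert(loop->lock_depth > 0);
    loop->lock_depth--;
}

/* Mock disconnect: fails loudly if called without the loop lock held. */
static int fp_pw_stream_disconnect(struct pw_stream *stream) {
    assert(mock_loop.lock_depth > 0);
    stream->connected = false;
    stream->disconnect_calls++;
    return 0;
}

/* Hypothetical helper: disconnect under the loop lock, at most once.
 * Clearing the caller's pointer makes a second call a no-op, so the
 * sketch does not rely on disconnect being idempotent. */
void cleanup_stream(struct pw_thread_loop *loop, struct pw_stream **stream) {
    if (*stream == NULL) {
        return; /* already cleaned up */
    }
    fp_pw_thread_loop_lock(loop);
    fp_pw_stream_disconnect(*stream);
    fp_pw_thread_loop_unlock(loop);
    *stream = NULL;
}
```

        Calling cleanup_stream(&mock_loop, &stream) twice on the same pointer disconnects only once in
        this sketch, because the NULL check guards the second call. Whether the real disconnect call
        tolerates being repeated is still the open question from (2).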


                prr Philip Race