JDK-8330773

ProcessImpl closes stdio pipes too early, resulting in dropped data

Details

    • generic
    • linux, os_x

    Description

      ADDITIONAL SYSTEM INFORMATION :
      Apple MacBook Air, M1 2020

      macOS Sonoma 14.2.1 (23C71)

      openjdk version "23-ea" 2024-09-17
      OpenJDK Runtime Environment (build 23-ea+18-1469)
      OpenJDK 64-Bit Server VM (build 23-ea+18-1469, mixed mode, sharing)


      A DESCRIPTION OF THE PROBLEM :
      I start a process using java.lang.ProcessBuilder and read the InputStream representing its stdout in order to collect the complete output of the process.

      Sometimes a few lines at the end of stdout are dropped.

      See the executable test case below.

      The Java program executes a bash script. The script echoes "abc" and pipes it to a child process. The child process reads stdin, does some processing, and writes the result to stdout.

      I think java.lang.ProcessImpl waits for the bash script to terminate; when that happens, it tries to drain the stdout pipe and then closes the InputStream that represents the process's stdout. It does not wait for the child process started by the script to finish its processing. That child process is still running, and when it later tries to write to stdout it gets a broken pipe error.

      I have added a sleep in the child process in the bash script to trigger this problem more reliably.

      I then set a breakpoint in ProcessImpl.drainInputStream() to make it wait before draining the input stream, giving the child process time to complete its processing. With that delay, the test produces the expected output every time.

      https://github.com/openjdk/jdk/blob/6d5699617ff0985104a8bb5f2c9eb8887cb0961e/src/java.base/unix/classes/java/lang/ProcessImpl.java#L588

      I think that the problem is that ProcessImpl.drainInputStream() is using in.available() to check if there is more data available to read. This only catches data that the child process has already written to stdout. But in.available() will also return zero if the child process is still processing data but has not yet written to stdout.
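
      To illustrate the available() behavior described above, here is a minimal sketch using an in-memory java.io pipe as a stand-in for the OS pipe (an assumption; the real stdout pipe is a file descriptor). The drainAvailable helper is hypothetical and only mimics an available()-based drain loop; it is not the JDK's implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

public class AvailableDrainSketch {
    // Hypothetical helper: reads only what available() reports right now.
    // It never blocks, so it cannot see data that is written later.
    public static byte[] drainAvailable(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n;
        while ((n = in.available()) > 0) {
            byte[] buf = new byte[n];
            int read = in.read(buf);
            if (read < 0) {
                break; // end of stream
            }
            out.write(buf, 0, read);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        PipedOutputStream writer = new PipedOutputStream();
        PipedInputStream reader = new PipedInputStream(writer);

        // The writer is still open but has not written anything yet,
        // like the child process that is still sleeping.
        byte[] early = drainAvailable(reader);
        System.out.println("drained before write: " + early.length + " bytes");

        // The writer produces its output afterwards and closes its end.
        writer.write("abc\n".getBytes(StandardCharsets.UTF_8));
        writer.close();

        // Reading until end of stream picks up the late data.
        byte[] all = reader.readAllBytes();
        System.out.println("read to EOF: " + new String(all, StandardCharsets.UTF_8).trim());
    }
}
```

      In this sketch the first drain returns zero bytes although the writer has not closed its end, which is the same situation as the child process that is still sleeping.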

      ProcessImpl.drainInputStream() should wait until the other side has closed the pipe.
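
      As a rough sketch of what "wait until the other side has closed the pipe" means, the following again uses an in-memory java.io pipe as a stand-in for the OS pipe (an assumption). The class and the readLateWriter helper are hypothetical names for illustration, and this is not a proposed patch for ProcessImpl. A writer thread sleeps before writing, like the child process in test.sh, and the reader blocks until the writer closes its end.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReadToEofSketch {
    // Hypothetical helper: a writer thread delays, writes, then closes its end;
    // the caller reads until end of stream instead of polling available().
    public static String readLateWriter(String text, long delayMillis) throws Exception {
        PipedOutputStream writer = new PipedOutputStream();
        PipedInputStream reader = new PipedInputStream(writer);

        Thread lateWriter = new Thread(() -> {
            // try-with-resources closes the write end even if the write fails
            try (writer) {
                Thread.sleep(delayMillis);
                writer.write(text.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        lateWriter.start();

        // Blocks until the writer has closed its end of the pipe.
        byte[] all = reader.readAllBytes();
        lateWriter.join();
        return new String(all, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("stdout: " + readLateWriter("abc", 200));
    }
}
```

      The trade-off of reading to end of stream is that the reader blocks for as long as any process keeps the write end open.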

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Run the provided Java code, which executes the provided bash script.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      stdout: abc
      stderr:

      ACTUAL -
      stdout:
      stderr:

      or, sometimes

      stdout:
      stderr: Broken pipe


      ---------- BEGIN SOURCE ----------
      // test.sh
      #!/bin/bash
      echo abc > >(read line ; sleep 1 ; echo "$line")

      // ProcessTest.java
      package app.truid.test;

      public class ProcessTest {
          public static void main(String[] args) throws Exception {
              Process proc = Runtime.getRuntime().exec(new String[] {"./test.sh"});

              int exitCode = proc.waitFor();
              String stdout = new String(proc.getInputStream().readAllBytes());
              String stderr = new String(proc.getErrorStream().readAllBytes());

              System.out.println("exitCode: " + exitCode);
              System.out.println("stdout: " + stdout);
              System.out.println("stderr: " + stderr);
          }
      }


      ---------- END SOURCE ----------

      FREQUENCY : always


      Attachments

        1. ProcessTest.java
          0.5 kB
        2. test.sh
          0.1 kB

          People

            rriggs Roger Riggs
            webbuggrp Webbug Group