-
Bug
-
Resolution: Fixed
-
P4
-
8, 9
-
b124
-
generic
-
generic
-
Verified
FULL PRODUCT VERSION :
A DESCRIPTION OF THE PROBLEM :
The skip() method of the InputStream returned by Process.getInputStream() does not always work correctly. It can skip fewer bytes than requested and, crucially, fewer bytes than its return value claims, even though the return value is supposed to be the exact number of bytes skipped. Subsequent reads are then out of sync with the data, so the stream appears corrupt.
I've found the bug to exist on at least Java 7 and Java 8 on Windows 2003 and Windows 7. (Those are the only platforms I have available to test.)
It's possible there are other types of stream affected too. I have not been able to track down the exact source of the bug. I believe the problem is happening somewhere in native code, and I am not set up to debug that. I have included everything I know about reproducing the problem in the test case.
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.io.*;

/**
 * Demonstration of a Java bug that occurs with data read over pipes.
 * When the skip() method is used on the pipe input stream, it seems to skip too few
 * bytes (in particular, fewer than its return value), which results in the wrong data
 * being subsequently read.
 */
public class PipeCorruption {
    public static void main(String[] args) throws Throwable {
        final File file = new File("pipedata.tmp");
        final int BLOCK_SIZE = 10000;

        // Generate a test file containing a series of data blocks of length BLOCK_SIZE,
        // each with a one-byte header. For example's sake, the "header" is a capital letter,
        // and the "data" is the lowercase version of that letter repeated BLOCK_SIZE times:
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(file))) {
            for (int header = 'A'; header <= 'Z'; header++) {
                out.write(header);
                int data = Character.toLowerCase(header);
                for (int i = 0; i < BLOCK_SIZE; i++) {
                    out.write(data);
                }
            }
        }

        // We ask some external program to echo the file, so we can read its output over a pipe:
        ProcessBuilder pb = new ProcessBuilder();
        pb.command("cmd", "/c", "type", file.toString());
        //pb.command("php", "-r", "readFile('" + file + "');");
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = pb.start();
        InputStream in = process.getInputStream();

        // Note: Process.getInputStream() actually returns a BufferedInputStream whose
        // skip() method works perfectly, partially obscuring the bug. Only when the
        // BufferedInputStream's buffer is drained, and it passes the skip() call to
        // the underlying native stream, does the problem occur.
        // Which external program is used can affect the way in which the data becomes
        // corrupt, which I think is because different programs output at a different
        // rate, which affects when the BufferedInputStream becomes drained, although
        // the problem does seem to be reproducible with any program. I originally
        // encountered the bug while reading video data being generated by FFmpeg.
        if (false) {
            // Optional: bypass the BufferedInputStream to retrieve the underlying stream,
            // and the bug becomes reliably reproduced even with small block sizes.
            if (in instanceof BufferedInputStream) {
                java.lang.reflect.Field f = FilterInputStream.class.getDeclaredField("in");
                f.setAccessible(true);
                in = (InputStream)f.get(in);
            }
        }
        if (false) {
            // Optional: Dump the entire data into a temp array first, then read it all
            // from there, and the bug disappears. This proves that the data is written
            // to the file correctly, and that the external program outputs it correctly,
            // and that the test case logic is correct.
            ByteArrayOutputStream tmp = new ByteArrayOutputStream();
            for (int i; (i = in.read()) != -1; tmp.write(i));
            in = new ByteArrayInputStream(tmp.toByteArray());
        }

        // Now iterate all the data blocks from the file:
        boolean error = false;
        int expectedHeader = 'A';
        for (;;) {
            // Read the header byte
            int header = in.read();
            if (header == -1) break; // done all blocks
            System.out.println((char)header);

            // The header byte should be simple 'A' to 'Z'.
            // When the bug hits, we will get lowercase letters instead.
            if (header != expectedHeader) error = true;
            expectedHeader++;

            // Handle the data bytes:
            // if we SKIP the bytes, then subsequent reads become corrupt (out-of-sync);
            // if we READ the bytes into a dummy array, then the bug completely disappears;
            int remaining = BLOCK_SIZE;
            do {
                int n;
                if (true) {
                    n = (int)in.skip(remaining);
                } else {
                    n = in.read(new byte[remaining]);
                }
                remaining -= n;
            } while (remaining != 0);
        }
        System.out.println(error ? "[FAIL]" : "[OK]");
        file.delete();
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Do not use the skip method. Instead, read bytes into a dummy byte array.
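The workaround can be packaged as a small helper. The following is only a sketch: the class name SkipWorkaround, the method name discardFully, and the 8192-byte scratch buffer size are illustrative choices, not part of the original report. It discards bytes by reading them into a reusable buffer instead of calling skip(), and it throws EOFException if the stream ends early rather than looping forever. A ByteArrayInputStream stands in for the process pipe so the example runs on any platform.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class SkipWorkaround {
    /**
     * Discards exactly n bytes from the stream by reading them into a
     * scratch buffer. skip() is never called. (Hypothetical helper.)
     */
    static void discardFully(InputStream in, long n) throws IOException {
        byte[] scratch = new byte[8192]; // buffer size is an arbitrary assumption
        while (n > 0) {
            int r = in.read(scratch, 0, (int) Math.min(scratch.length, n));
            if (r == -1) {
                throw new EOFException("stream ended with " + n + " bytes left to discard");
            }
            n -= r;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the pipe: two blocks, each a one-byte header plus 3 data bytes.
        byte[] data = {'A', 'a', 'a', 'a', 'B', 'b', 'b', 'b'};
        InputStream in = new ByteArrayInputStream(data);

        StringBuilder headers = new StringBuilder();
        int header;
        while ((header = in.read()) != -1) {
            headers.append((char) header);
            discardFully(in, 3); // consume the data bytes without skip()
        }
        System.out.println(headers); // prints "AB"
    }
}
```

In the test case, the inner do/while loop could be replaced by a single call such as discardFully(in, BLOCK_SIZE), which also avoids allocating a new array for every block.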