ADDITIONAL SYSTEM INFORMATION :
Reproduced on OSX using OpenJDK versions 1.8.0_202, 10.0.2, 11.0.2
Also reproduced on Scientific Linux 7.6 version OpenJDK 1.8.0_201
A DESCRIPTION OF THE PROBLEM :
When Base64-encoded input of certain lengths is read through a wrapped InputStream ( Base64.getDecoder().wrap(InputStream) ) using certain buffer sizes,
2 additional null bytes are appended to the end of the decoded output.
Only some combinations of input length and buffer size are affected, but the problem was encountered in the wild with a buffer size of 4096. It looks like an unhandled edge case.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
See the source code below.
javac Base64DecodeProblem.java
java Base64DecodeProblem
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The decoded bytes read from the wrapped stream should match the originally encoded bytes ("12345678"), when a byte buffer of length 7 is used to read decoded output.
ACTUAL -
Two additional null bytes appear after the originally encoded bytes ("12345678\0\0"), when a byte buffer of length 7 is used to read decoded output.
---------- BEGIN SOURCE ----------
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/**
 * Class to reproduce java.util.Base64 Decoder wrapped stream issue.
 *
 * @author jmfee@usgs.gov
 */
public class Base64DecodeProblem {

    // length 8 is first case where issue appears (buffer size 7)
    public static final String RAW = "12345678";
    // length 26 is first case where 2 buffer sizes are affected (5, 25)
    // public static final String RAW = "12345678901234567890123456";

    public static void main(final String[] args) throws Exception {
        byte[] raw = RAW.getBytes(StandardCharsets.UTF_8);
        System.err.println("input string = \"" + escaped(RAW) + "\"");
        System.err.println("length = " + raw.length);

        // encode to base64
        byte[] encoded = Base64.getEncoder().encode(raw);
        System.err.println("encoded = \""
                + new String(encoded, StandardCharsets.UTF_8) + "\"");

        // decode using different buffer sizes, 1 through 8191
        for (int bufferSize = 1; bufferSize < 8192; bufferSize++) {
            try (
                InputStream in = Base64.getDecoder().wrap(
                        new ByteArrayInputStream(encoded));
                ByteArrayOutputStream baos = new ByteArrayOutputStream();
            ) {
                byte[] buffer = new byte[bufferSize];
                int read;
                while ((read = in.read(buffer, 0, bufferSize)) != -1) {
                    baos.write(buffer, 0, read);
                }

                // compare result, output info if lengths do not match
                byte[] decoded = baos.toByteArray();
                if (decoded.length != raw.length) {
                    System.err.println("Buffer size = " + bufferSize);
                    System.err.println("\tdecoded length = " + decoded.length);
                    System.err.println("\tdecoded = \""
                            + escaped(new String(decoded, StandardCharsets.UTF_8))
                            + "\"");
                }
            }
        }
    }

    // make line breaks and null characters visible in printed output
    public static String escaped(final String str) {
        String escaped = str;
        escaped = escaped.replace("\n", "\\n");
        escaped = escaped.replace("\r", "\\r");
        escaped = escaped.replace("\0", "\\0");
        return escaped;
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Buffer size of 1 byte never seems to be affected.
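A minimal sketch of that workaround, assuming the observation above holds: the wrapped stream is drained with a 1-byte buffer. The class name Base64WorkaroundSketch and the method decodeByteAtATime are hypothetical and are not part of the reproduction source.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, not part of the original report.
final class Base64WorkaroundSketch {

    // Reads the wrapped decoding stream with a 1-byte buffer, the only
    // buffer size the report describes as never affected.
    static byte[] decodeByteAtATime(final byte[] encoded) throws IOException {
        try (
            InputStream in = Base64.getDecoder().wrap(
                    new ByteArrayInputStream(encoded));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
        ) {
            byte[] single = new byte[1];
            int read;
            while ((read = in.read(single, 0, 1)) != -1) {
                out.write(single, 0, read);
            }
            return out.toByteArray();
        }
    }

    public static void main(final String[] args) throws IOException {
        byte[] raw = "12345678".getBytes(StandardCharsets.UTF_8);
        byte[] encoded = Base64.getEncoder().encode(raw);
        byte[] decoded = decodeByteAtATime(encoded);
        // report whether the decoded bytes match the original input
        System.err.println("matches input = " + Arrays.equals(raw, decoded));
    }
}
```

Reading one byte per call is slow for large inputs. Where the whole payload fits in memory, decoding it in one call with Base64.getDecoder().decode(byte[]) avoids the wrapped stream entirely; whether that suits a given caller is an assumption, not something the report tested.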
FREQUENCY : always