JDK-8264777: Overload optimized FileInputStream::readAllBytes


Details

    • Resolved in build: b23
    • Verification: Verified

    Description

      ADDITIONAL SYSTEM INFORMATION :
      Here is a JMH benchmark that provides an implementation example and shows a 30% performance gain for 10 MB files on Windows. The improvement is smaller for smaller files, but I did not see any degradation for any file size from 100 bytes to 10 MB:

      package benchmarks;

      import java.io.IOException;
      import java.util.Arrays;

      import org.openjdk.jmh.annotations.Benchmark;
      import org.openjdk.jmh.annotations.BenchmarkMode;
      import org.openjdk.jmh.annotations.Fork;
      import org.openjdk.jmh.annotations.Measurement;
      import org.openjdk.jmh.annotations.OutputTimeUnit;
      import org.openjdk.jmh.annotations.Param;
      import org.openjdk.jmh.annotations.Setup;
      import org.openjdk.jmh.annotations.State;
      import org.openjdk.jmh.annotations.TearDown;
      import org.openjdk.jmh.annotations.Warmup;
      import org.openjdk.jmh.runner.Runner;
      import org.openjdk.jmh.runner.RunnerException;
      import org.openjdk.jmh.runner.options.Options;
      import org.openjdk.jmh.runner.options.OptionsBuilder;

      @BenchmarkMode(org.openjdk.jmh.annotations.Mode.SingleShotTime)
      @OutputTimeUnit(java.util.concurrent.TimeUnit.MILLISECONDS)
      @State(org.openjdk.jmh.annotations.Scope.Thread)
      @Fork(value = 1, jvmArgsAppend = { "-ea" })
      @Warmup(batchSize = 1000)
      @Measurement(batchSize = 1000)
      public class FileRead {
          // XXX put in any directory where the files are located.
          String dirName = "resources/";
          // XXX put in any filenames you like to test.
          @Param({ "100b.txt", "1k.txt", "10k.txt", "100k.txt", "1MB.txt", "10MB.txt" })
          String fileName;

          java.io.File file;
          byte[] result;
          byte[] expected;

          @Setup
          public void setup() throws IOException {
              file = new java.io.File(dirName + fileName).getAbsoluteFile();
              result = null;
              expected = java.nio.file.Files.readAllBytes(file.toPath());
          }

          @TearDown
          public void check() {
              assert Arrays.equals(expected, result) : "Nothing changed?";
          }

          public static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

          @Benchmark
          public void readAllBytesOld() throws IOException {
              try (java.io.InputStream input = new java.io.FileInputStream(file)) {
                  result = input.readAllBytes();
              }
          }

          @Benchmark
          public void readAllBytesNew() throws IOException {
              try (java.io.InputStream input = new java.io.FileInputStream(file) {

                  @Override
                  public byte[] readAllBytes() throws IOException {
                      long length = this.getChannel().size();
                      // use jdk.internal.util.ArraysSupport.newLength(int, int, int)?
                      if (length > MAX_ARRAY_LENGTH)
                          throw new OutOfMemoryError("File too large for array: " + length);
                      return readNBytes(this, (int) length);
                  }

                  byte[] readNBytes(java.io.InputStream input, int byteLength) throws IOException {
                      if (byteLength == 0)
                          return new byte[0];
                      byte[] byteBuf = new byte[byteLength]; // exact buffer size
                      int byteCount = 0;
                      int byteTransferSize = byteBuf.length;
                      int bytesRead;
                      while ((bytesRead = input.read(byteBuf, byteCount, byteTransferSize)) >= 0) {
                          byteCount += bytesRead;
                          byteTransferSize = byteBuf.length - byteCount;
                          if (byteTransferSize <= 0) {
                              break;
                          }
                      }
                      return (byteBuf.length == byteCount) ? byteBuf : Arrays.copyOf(byteBuf, byteCount);
                  }
              }) {
                  result = input.readAllBytes();
              }
          }

          public static void main(String[] args) throws RunnerException, InterruptedException {
              Options opt = new OptionsBuilder().include(FileRead.class.getSimpleName()).shouldFailOnError(true).build();
              new Runner(opt).run();
          }
      }

      My results:

      Benchmark                  (fileName)  Mode  Cnt      Score  Error  Units
      ReadByt3.readAllBytesNew     100b.txt    ss          34,081         ms/op
      ReadByt3.readAllBytesNew       1k.txt    ss          35,951         ms/op
      ReadByt3.readAllBytesNew      10k.txt    ss          40,996         ms/op
      ReadByt3.readAllBytesNew     100k.txt    ss          66,433         ms/op
      ReadByt3.readAllBytesNew      1MB.txt    ss         587,246         ms/op
      ReadByt3.readAllBytesNew     10MB.txt    ss        5361,234         ms/op

      ReadByt3.readAllBytesOld     100b.txt    ss          35,115         ms/op
      ReadByt3.readAllBytesOld       1k.txt    ss          35,951         ms/op
      ReadByt3.readAllBytesOld      10k.txt    ss          45,528         ms/op
      ReadByt3.readAllBytesOld     100k.txt    ss         125,894         ms/op
      ReadByt3.readAllBytesOld      1MB.txt    ss         630,972         ms/op
      ReadByt3.readAllBytesOld     10MB.txt    ss        7538,637         ms/op


      A DESCRIPTION OF THE PROBLEM :
      InputStream::readAllBytes currently reads all bytes through a series of intermediate buffers (https://bugs.openjdk.java.net/browse/JDK-8193832). For local files - where the file size is known in advance - this could be optimized by reading all bytes from the OS in a single pass, avoiding the additional array creations and array copies.
      Reading a whole file at once is, for example, heavily used in products like the Eclipse IDE.
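
      As a rough sketch of the same idea using only public API at the call site (readFileFully is a hypothetical helper name, not part of the proposal): size the destination array from the file channel and fill it with one readNBytes pass instead of the chained intermediate buffers.

      import java.io.File;
      import java.io.FileInputStream;
      import java.io.IOException;
      import java.util.Arrays;

      class ReadFullyExample {
          // Hypothetical helper: approximates the proposed optimization for local files
          // whose size is known up front via FileChannel.size().
          static byte[] readFileFully(File f) throws IOException {
              try (FileInputStream in = new FileInputStream(f)) {
                  long length = in.getChannel().size();
                  if (length > Integer.MAX_VALUE - 8)
                      throw new OutOfMemoryError("File too large for a single array: " + length);
                  byte[] buf = new byte[(int) length];       // exact-size buffer, allocated once
                  int n = in.readNBytes(buf, 0, buf.length); // fill it in one pass (Java 9+ API)
                  // If the file shrank after size() was sampled, trim to the bytes actually read.
                  return (n == buf.length) ? buf : Arrays.copyOf(buf, n);
              }
          }
      }

      The benchmark above expresses the same approach as an anonymous override of FileInputStream::readAllBytes.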


People

    Assignee: bpb Brian Burkhalter
    Reporter: webbuggrp Webbug Group