A read of size N on an InputStream returned by Channels.newInputStream(channel) results in a non-heap memory allocation of size N if the channel is a socket channel, regardless of how big N is. We saw this with an RPC over a Unix-domain socket, but I think it would also apply to a network socket. The header of a received RPC response indicated the payload size, which in our case was about 120M. Our code then allocated a byte[] of that size and issued a read for all 120M. The InputStream in question comes from this code path in Channels.newInputStream:
```
// java.nio.channels.Channels
public static InputStream newInputStream(ReadableByteChannel ch) {
    Objects.requireNonNull(ch, "ch");
    return sun.nio.ch.Streams.of(ch);
}

// sun.nio.ch.Streams
public static InputStream of(ReadableByteChannel ch) {
    if (ch instanceof SocketChannelImpl sc) {
        return new SocketInputStream(sc);
    } else {
        return new ChannelInputStream(ch);
    }
}
```
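For concreteness, the calling side looked roughly like this; the class name, method and length-prefixed header handling below are hypothetical stand-ins for our actual RPC code:
```
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.SocketChannel;

// Hypothetical sketch of the calling pattern described above.
final class RpcReader {
    static byte[] readPayload(SocketChannel channel) throws IOException {
        InputStream in = Channels.newInputStream(channel); // SocketInputStream for a socket channel
        DataInputStream din = new DataInputStream(in);
        int payloadSize = din.readInt();        // the header said ~120M in our case
        byte[] payload = new byte[payloadSize];
        din.readFully(payload);                 // issues reads for the full remaining size
        return payload;
    }
}
```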
A read on the resultant SocketInputStream ends up doing this:
```
// sun.nio.ch.SocketChannelImpl
private int tryRead(byte[] b, int off, int len) throws IOException {
    ByteBuffer dst = Util.getTemporaryDirectBuffer(len);
    ...
```
Here Util.getTemporaryDirectBuffer(len) essentially ends up calling malloc(len), so if you are reading into a byte[] of size 2G this code will allocate an additional 2G of non-heap memory for the buffer. I believe that if the system property jdk.nio.maxCachedBufferSize is unset, the buffer is then kept in a cache rather than freed after use. Something similar happens for writes.
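A minimal self-contained sketch of a reproducer, assuming a loopback TCP SocketChannel behaves the same way as the Unix-domain socket in our case (the class and sizes are illustrative):
```
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Per the tryRead path shown above, the single read below should allocate a
// temporary direct buffer of the full requested length even though only a few
// bytes ever arrive on the socket.
public class LargeReadRepro {
    public static void main(String[] args) throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel peer = server.accept()) {
                peer.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));

                InputStream in = Channels.newInputStream(client);
                byte[] buf = new byte[120 * 1024 * 1024]; // ~120M, as in the RPC case above
                int n = in.read(buf); // len = buf.length is passed down to getTemporaryDirectBuffer
                System.out.println("read " + n + " bytes; watch process RSS around this call");
            }
        }
    }
}
```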
Prior to https://github.com/openjdk/jdk/commit/0312694c46b4fb3455cde2e4d1f8746ad4df8548 this allocation would have been subject to the MaxDirectMemorySize limit, but it no longer is.
I think this behaviour is surprising, and should at least be documented in Channels.newInputStream. A less surprising implementation could avoid allocating a buffer of more than (say) 1M, limiting reads to that size and breaking up writes into chunks of that size.
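A caller can approximate that chunking today by capping the length passed to each read, which bounds the temporary buffer at the cost of extra read calls. A minimal sketch (the helper and the 1M cap are illustrative, not an existing API):
```
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: read into dst in slices of at most 1M so that the len
// passed down to tryRead (and hence the temporary direct buffer) stays small.
final class ChunkedReads {
    private static final int MAX_CHUNK = 1 << 20; // 1M cap per read call

    static void readFully(InputStream in, byte[] dst, int off, int len) throws IOException {
        while (len > 0) {
            int n = in.read(dst, off, Math.min(len, MAX_CHUNK));
            if (n < 0) {
                throw new EOFException(len + " bytes still expected");
            }
            off += n;
            len -= n;
        }
    }
}
```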
It's also easy enough to work around the issue at the stream level once you know you have it: either provide a trivial forwarding implementation of ReadableByteChannel to foil the instanceof check in Streams.of, or write your own InputStream that reads via ByteBuffer.wrap, which is effectively what ChannelInputStream does.
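A sketch of the forwarding-channel variant, assuming it only needs to defeat the instanceof check (the class name is mine):
```
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Wraps the socket channel in a plain ReadableByteChannel so that Streams.of
// falls through to the ChannelInputStream branch instead of SocketInputStream.
final class NonSocketReadableByteChannel implements ReadableByteChannel {
    private final ReadableByteChannel delegate;

    NonSocketReadableByteChannel(ReadableByteChannel delegate) {
        this.delegate = delegate;
    }

    @Override
    public int read(ByteBuffer dst) throws IOException {
        return delegate.read(dst);
    }

    @Override
    public boolean isOpen() {
        return delegate.isOpen();
    }

    @Override
    public void close() throws IOException {
        delegate.close();
    }
}
```
Creating the stream as Channels.newInputStream(new NonSocketReadableByteChannel(socketChannel)) then takes the ChannelInputStream branch in Streams.of, which wraps the caller's byte[] with ByteBuffer.wrap as described above.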