-
Enhancement
-
Resolution: Unresolved
-
P4
-
None
-
None
-
generic
-
linux, solaris, windows, aix
While implementing Lucene's code using Project Panama, a very long-standing issue (see also JDK-4032069, which was closed as won't fix long ago) popped up again. I already discussed this with Maurizio, and he asked me to open a bug report for wider discussion. I can convert this to a JEP if needed.
Problem: Interoperability between Java code built on the java.io/java.nio packages and MethodHandles created as foreign downcalls into libc does not work, because the file descriptor required to call those functions is unreachable from Java code.
The same problem existed for process ids, but this was solved in Java 9, e.g. you can call ProcessHandle.current().pid() to get the process id of the current Java process.
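For comparison, the process-id case mentioned above is a one-liner today. A minimal sketch (the class and method names here are only for illustration):

```java
// Illustrative only: PidExample and currentPid() are names made up for this sketch.
final class PidExample {

    /** Returns the native process id of the running JVM via the Java 9+ ProcessHandle API. */
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        System.out.println("pid = " + currentPid());
    }
}
```

The value is a plain long that can be handed to native code; the request here is for an equivalent accessor for file descriptors.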
If you have an InputStream or FileChannel, it is impossible to get the file descriptor behind it (on either Linux or Windows), as the FileDescriptor class has no useful methods. As with ProcessHandle, I propose adding a method fd() or handle() to the FileDescriptor class, so you can get hold of the handle and pass it to foreign functions. In addition, it would be good to be able to get the FileDescriptor from a FileChannel: when doing mmap you use a FileChannel, but FileChannel has no getFD() method like FileInputStream and others. Note that this problem also affects the modern java.nio.file.Files APIs, so a way to get the FileDescriptor from there would also be good.
JDK-4032069 says it would be trivial to implement your own InputStream on top of open/read/seek with libc, but why should a developer need to do this? It might be simple for an InputStream, but it gets complex if you just want to set some flags (ioctl) on sockets or use fadvise or madvise, because you are then required to reimplement the whole socket API or FileChannel API, just because you want to make some minor modifications on the file descriptor!
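To illustrate the gap, here is a small sketch of what the current public API offers. FileInputStream#getFD() and FileInputStream#getChannel() exist, but the FileDescriptor instance only exposes valid() and sync(), and the channel has no getFD(). The class and method names are made up for this example.

```java
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: FdAccessExample and demo() are names made up for this sketch.
final class FdAccessExample {

    /** Opens a temporary file and reports whether its descriptor and channel are usable. */
    static boolean demo() {
        try {
            Path tmp = Files.createTempFile("fd-example", ".bin");
            try (FileInputStream in = new FileInputStream(tmp.toFile())) {
                FileDescriptor fd = in.getFD();   // available on java.io streams
                FileChannel ch = in.getChannel(); // stream -> channel is possible
                // There is no ch.getFD(), and fd has no accessor for the native
                // handle: its public instance methods are valid() and sync().
                return fd.valid() && ch.isOpen();
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("descriptor valid = " + demo());
    }
}
```

Nothing in this snippet yields a value that could be passed to a native function expecting an int fd or a Windows HANDLE.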
In Apache Lucene we would like to call madvise() on a FileChannel's FileDescriptor to have more predictable mapping behaviour when we call FileChannel#map afterwards. Also when writing to a FileOutputStream / Files#newOutputStream() we would like to call fadvise() to disable caching. We would also like to use the new Linux 5.4 constants, so a generic Java-based madvise/fadvise method (which would be another feature request) is not enough.
Note about security: There is no security risk in making the integer/long value of a low-level file descriptor available, because to interact with it you would need JNI or Project Panama, which add all the security on top. The integer value on its own is meaningless without access to native APIs. The discussion is comparable to the one about ProcessHandle#pid() back in the Java 9 days.
What needs to be changed:
- Add "long handle()" or "long fd()" to the FileDescriptor class.
- Add FileChannel#getFD()
- Think about how to obtain the FileChannel from java.nio.file APIs. At the moment this is an open discussion. A solution might be to let the InputStreams/Channels/OutputStreams implement some interface java.nio.files.HasFileDescriptor that declares a "FileDescriptor getFD()" method. FileInputStream/FileOutputStream/FileChannel/RandomAccessFile could implement that interface too. This would allow code to get the FileDescriptor using an instanceof check.
Workarounds: Write your own implementation of FileInputStream/FileOutputStream/FileChannel/memory mapping/socket I/O and open files/sockets using native code.
Summary: Now that Project Panama is out of the door, this looks like a very important extension to the core library, because at the moment there is no way for downcall MethodHandles into libc / kernel32.dll to interoperate with Java's native I/O functionality. The simple and low-risk addition of a way to get a native file descriptor as a "long" from open files would be a very important addition for projects like Apache Lucene to make real use of Project Panama.
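The instanceof idea from the last item could look roughly like the sketch below. Everything here is hypothetical: HasFileDescriptor does not exist in the JDK, DescriptorAwareInputStream is only a stand-in for a JDK stream that would implement it, and the "throws IOException" clause is an assumption borrowed from FileInputStream#getFD().

```java
import java.io.ByteArrayInputStream;
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical interface from the proposal; NOT part of the JDK.
interface HasFileDescriptor {
    FileDescriptor getFD() throws IOException;
}

// Stand-in for a JDK stream that would implement the proposed interface.
final class DescriptorAwareInputStream extends FilterInputStream implements HasFileDescriptor {
    private final FileInputStream delegate;

    DescriptorAwareInputStream(FileInputStream in) {
        super(in);
        this.delegate = in;
    }

    @Override
    public FileDescriptor getFD() throws IOException {
        return delegate.getFD();
    }
}

final class HasFileDescriptorExample {

    /** Returns the descriptor if the stream advertises one, otherwise empty. */
    static Optional<FileDescriptor> descriptorOf(InputStream in) {
        if (in instanceof HasFileDescriptor) {
            try {
                return Optional.of(((HasFileDescriptor) in).getFD());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return Optional.empty();
    }

    /** Checks a file-backed stream (descriptor present) and an in-memory one (absent). */
    static boolean demo() {
        try {
            Path tmp = Files.createTempFile("has-fd", ".bin");
            try (InputStream in = new DescriptorAwareInputStream(new FileInputStream(tmp.toFile()))) {
                boolean fileBacked = descriptorOf(in).map(FileDescriptor::valid).orElse(false);
                boolean inMemory = descriptorOf(new ByteArrayInputStream(new byte[0])).isPresent();
                return fileBacked && !inMemory;
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("instanceof lookup works = " + demo());
    }
}
```

Callers that receive an arbitrary InputStream, for example from Files#newInputStream, would then degrade gracefully when no descriptor is available instead of having to know the concrete stream class.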
- relates to
-
JDK-8283073 Consider exposing the UNIX inode number and Windows File ID as attributes
-
- Open
-