Summary
This CSR refers to the latest iteration of the Foreign Function & Memory API originally targeted for Java 17, with the goal of further consolidating the API, as well as addressing the feedback received so far from developers.
Problem
Real-world use of the Foreign Function & Memory APIs revealed some remaining usability issues, listed below:
There is an asymmetry between the allocation API (SegmentAllocator) and the dereference API. More specifically, when allocating a segment from an existing Java value/array, a SegmentAllocator also accepts the ValueLayout corresponding to the value/array element, so that the necessary alignment constraints and endianness can be applied. But the static dereference methods in MemoryAccess do not take any layout argument; instead, they optionally accept a ByteOrder argument, to perform byte swapping. This asymmetry can lead to subtle mistakes, where a segment is allocated as an array whose element is defined by a given layout, but then the array is accessed in ways that are incompatible with that layout.
Some useful data types (boolean and MemoryAddress) are not supported by memory access var handles.
The API makes excessive use of static methods. There is a class MemoryAccess containing several static dereference methods (see above), and the CLinker class also contains several static helper functions, e.g. to convert a Java string to a C string and back.
The MemoryAddress class is an entity with its own ResourceScope object. The reason for this choice is that a client can e.g. request the base address of a memory segment, and expect the address to keep a reference to the segment scope. But making MemoryAddress a scoped entity creates confusion in the more common case where an address is returned by a native call, in which case neither spatial nor temporal bounds are available.
Memory layouts interacting with the CLinker API need to be constructed in a special way; they must embed special layout attributes that encode additional information, which allows the linker runtime to classify the argument correctly when a new downcall method handle is created. Also, there seems to be some redundancy in how downcall method handles are created: clients have to pass both a FunctionDescriptor and a MethodType, even though, in most cases, the information in the MethodType can be inferred from that in the FunctionDescriptor.
Calling native functions using downcall method handles can be unsafe: consider the case where a segment is passed by-reference to a downcall method handle. In this case, the segment address is obtained, and then passed to the native call. If the segment is backed by a shared scope, it would be possible for a client in another thread to close the segment scope concurrently, which might cause the native call to malfunction.
The way in which dependencies between scopes are set up, using ResourceScope::acquire/release, is too low-level. There is no way to explicitly set up a temporal dependency between two scopes without resorting to complex uses of ResourceScope::addCloseAction.
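The first problem in the list can be illustrated without the incubator API. The sketch below uses java.nio.ByteBuffer as a stand-in for a memory segment (an assumption made so that the example runs on any JDK; it is not the Foreign Function & Memory API itself, and the class and method names are invented for the illustration). A value is stored with one byte order and later read with another, because nothing ties the read site to the layout chosen at the write site.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; ByteBuffer stands in for a memory segment.
class LayoutMismatchDemo {
    // Write side: the byte order is fixed up front, much like a layout would fix it.
    static ByteBuffer allocateInt(int value, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES).order(order);
        buffer.putInt(0, value);
        return buffer;
    }

    // Read side: the byte order is an independent argument, so it can disagree with the write side.
    static int readInt(ByteBuffer buffer, ByteOrder order) {
        return buffer.duplicate().order(order).getInt(0);
    }

    public static void main(String[] args) {
        ByteBuffer segment = allocateInt(1, ByteOrder.BIG_ENDIAN);
        System.out.println(readInt(segment, ByteOrder.BIG_ENDIAN));    // prints 1
        System.out.println(readInt(segment, ByteOrder.LITTLE_ENDIAN)); // prints 16777216
    }
}
```

Both reads compile and run without complaint; only the second one silently returns the wrong value, which is the kind of subtle mistake described above.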
Solution
Here we describe the main ideas behind the API changes brought forward in this CSR:
The main change in this iteration of the API is that ValueLayout is now always associated with a Java carrier type. For this reason, the API features specialized subclasses, like ValueLayout.OfInt, ValueLayout.OfLong etc. The relationship between ValueLayout and a Java carrier simplifies the API in a number of ways:
- We can define a set of dereference methods accepting a (specialized) value layout subclass; for instance, instead of getInt() we can have a method like get(ValueLayout.OfInt). This allows us to fix the asymmetry between the dereference API and the allocation API.
- We can use the carrier information attached to value layouts to decide how to classify parameters to downcall method handles. This effectively removes the need to accept a (now redundant) MethodType parameter in CLinker::downcallHandle. This also makes the layout attributes machinery redundant; it has in fact been removed in this iteration.
- We can attach constant var handles to value layouts, which means that obtaining a memory access var handle from a value layout can be far more efficient than before.
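The first two points can be sketched with a small model. Everything in it is hypothetical and written only for this illustration: the nested classes Layout, OfInt, OfLong and Segment and the helper toMethodType are stand-ins, not the classes of the actual API, and a ByteBuffer replaces the memory segment. The sketch shows how a layout that carries its Java carrier lets overloaded get/set methods replace getInt()-style accessors, and how a MethodType can be derived from layouts alone.

```java
import java.lang.invoke.MethodType;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical model; none of these types belong to the real API.
class CarrierLayoutDemo {
    static abstract class Layout {
        final Class<?> carrier; // Java type associated with the layout
        final ByteOrder order;  // endianness applied on every access

        Layout(Class<?> carrier, ByteOrder order) {
            this.carrier = carrier;
            this.order = order;
        }
    }

    static final class OfInt extends Layout {
        OfInt(ByteOrder order) { super(int.class, order); }
    }

    static final class OfLong extends Layout {
        OfLong(ByteOrder order) { super(long.class, order); }
    }

    // Toy segment backed by a ByteBuffer; duplicates share the same content.
    static final class Segment {
        private final ByteBuffer buffer;

        Segment(int byteSize) { this.buffer = ByteBuffer.allocate(byteSize); }

        // Overload resolution on the layout subclass selects the carrier statically.
        int get(OfInt layout, int offset) { return buffer.duplicate().order(layout.order).getInt(offset); }
        long get(OfLong layout, int offset) { return buffer.duplicate().order(layout.order).getLong(offset); }

        void set(OfInt layout, int offset, int value) { buffer.duplicate().order(layout.order).putInt(offset, value); }
        void set(OfLong layout, int offset, long value) { buffer.duplicate().order(layout.order).putLong(offset, value); }
    }

    // The carriers alone determine the method type, so no separate MethodType argument is needed.
    static MethodType toMethodType(Layout result, Layout... arguments) {
        Class<?>[] parameterTypes = new Class<?>[arguments.length];
        for (int i = 0; i < arguments.length; i++) {
            parameterTypes[i] = arguments[i].carrier;
        }
        return MethodType.methodType(result.carrier, parameterTypes);
    }

    public static void main(String[] args) {
        OfInt intLE = new OfInt(ByteOrder.LITTLE_ENDIAN);
        OfLong longBE = new OfLong(ByteOrder.BIG_ENDIAN);

        Segment segment = new Segment(16);
        segment.set(intLE, 0, 42);
        segment.set(longBE, 8, 7L);
        System.out.println(segment.get(intLE, 0));  // prints 42
        System.out.println(segment.get(longBE, 8)); // prints 7
        System.out.println(toMethodType(longBE, intLE, intLE));
    }
}
```

Because the same layout object is used for both set and get, the byte order cannot drift between the write site and the read site, which is the symmetry the change aims for.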
Support for boolean and MemoryAddress has been added to memory access var handles. These carriers are considered secondary carriers (as opposed to primary carriers, such as byte, short, char, int, float, long, double). The reason for this distinction is that secondary carriers cannot be copied in bulk to and from memory segments, as each element requires some adjustment (e.g. a MemoryAddress has to be lowered to a long value, while a boolean has to be normalized to either 1 or 0).
The API has been significantly simplified, and some classes have been removed:
- The MemoryAccess class is no longer present. Instead, instance dereference methods are present in both MemorySegment and MemoryAddress (the latter are restricted, as an address has no bounds).
- The MemoryLayouts class is also removed. Value layout constants (JAVA_INT etc.) have been moved inside ValueLayout (while other layout constants have been dropped).
- Most of the static methods in CLinker (e.g. to convert from Java strings to C strings and back) have been moved to MemorySegment, MemoryAddress and SegmentAllocator. The platform-dependent layout constants in CLinker (C_INT etc.) have been dropped. It is the role of extraction tools to generate layouts for basic C types that are compatible with a given target platform.
- The CLinker.TypeKind enum has been removed (as it is no longer attached to layouts for classification purposes).
- The VaList class has been moved to the top level.
MemoryAddress no longer features a ResourceScope accessor. That is, MemoryAddress denotes a raw machine address, and has no notion of spatial and temporal bounds associated with it. Clients can no longer obtain the base address associated with heap segments (i.e. MemoryAddress is for off-heap access only). When parameters are passed by-reference to a downcall method handle, the method handle now takes an Addressable parameter, not a MemoryAddress one. This change allows memory segments to be passed to downcall method handles more directly; the linker runtime will try to keep such arguments alive for the entire duration of a native call. This greatly enhances the safety of the CLinker API, and reduces the number of conversions required in user code.
Since MemoryAddress no longer has a ResourceScope, a new entity named NativeSymbol has been added, which represents a symbol in a library (either a function or a global variable). A NativeSymbol has a scope and a name, and is accepted by CLinker::downcallHandle when creating downcall method handles. Also, CLinker::upcallStub returns a new (anonymous) NativeSymbol, which points to the native function generated by the VM which calls back to the target Java method handle provided at creation. The scope attached to a native symbol can be closed at any time, and closing it will cause the symbol to be unloaded. Again, CLinker will make sure that a native symbol scope cannot be closed while in the middle of performing a native call.
The ResourceScope class contains some simplifications. First, there is no longer a distinction between implicit and explicit scopes: all scopes (except the global scope) are explicit and can be closed, and some scopes are additionally associated with a Cleaner instance. Secondly, a new method ResourceScope::keepAlive(ResourceScope) has been added to replace the pair of ResourceScope::acquire/release as well as the ResourceScope.Handle class.
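The temporal dependency set up by ResourceScope::keepAlive can be pictured with a toy model. The Scope class below is hypothetical and single-threaded, written only for this sketch; it assumes that a.keepAlive(b) means "b cannot be closed before a", and that an attempt to close a scope that is still kept alive fails with IllegalStateException. The attached specdiff remains the authority on the precise behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of scope dependencies; not the real ResourceScope.
class KeepAliveDemo {
    static final class Scope implements AutoCloseable {
        private final String name;
        private final List<Scope> targets = new ArrayList<>(); // scopes this scope keeps alive
        private int pendingKeepAlives;                         // scopes currently keeping this one alive
        private boolean closed;

        Scope(String name) { this.name = name; }

        // Assumed semantics: 'target' cannot be closed before this scope is closed.
        void keepAlive(Scope target) {
            if (target == this) throw new IllegalArgumentException("a scope cannot keep itself alive");
            if (closed || target.closed) throw new IllegalStateException("already closed");
            target.pendingKeepAlives++;
            targets.add(target);
        }

        boolean isAlive() { return !closed; }

        @Override
        public void close() {
            if (closed) throw new IllegalStateException(name + " is already closed");
            if (pendingKeepAlives > 0) throw new IllegalStateException(name + " is kept alive by another scope");
            closed = true;
            // Closing this scope releases every dependency it had set up.
            for (Scope target : targets) target.pendingKeepAlives--;
            targets.clear();
        }
    }

    public static void main(String[] args) {
        Scope library = new Scope("library");
        Scope call = new Scope("call");
        call.keepAlive(library); // 'library' must outlive 'call'

        try {
            library.close();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        call.close();    // releases the dependency
        library.close(); // now succeeds
        System.out.println(library.isAlive()); // prints false
    }
}
```

Compared with an acquire/release pair, the dependency is declared once and released automatically when the dependent scope closes, so there is no handle for the client to track.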
Specification
A specdiff of the changes as of November 11th, 2021 has been attached to this CSR (v3).
A link to the latest javadoc (as of November 11th, 2021) is included below:
A link to the latest specdiff (as of November 11th, 2021) is included below:
http://cr.openjdk.java.net/~mcimadamore/JEP-419/v3/specdiff_out/overview-summary.html
- csr of JDK-8275063 Implementation of Foreign Function & Memory API (Second incubator) (Resolved)