Summary
This CSR refers to the latest iteration of the Foreign Function & Memory API originally targeted for Java 17, with the goal of further consolidating the API, as well as addressing the feedback received so far from developers.
Problem
Real-world use of the Foreign Function & Memory APIs revealed some remaining usability issues, listed below:
- There is an asymmetry between the allocation API (SegmentAllocator) and the dereference API. More specifically, when allocating a segment from an existing Java value/array, a SegmentAllocator also accepts the ValueLayout corresponding to the value/array element, so that the necessary alignment constraints and endianness can be applied. But the static dereference methods in MemoryAccess do not take any layout argument; instead, they optionally accept a ByteOrder argument, to perform byte swapping. This asymmetry can lead to subtle mistakes, where a segment is allocated as an array whose element is defined by a given layout, but then the array is accessed in ways that are incompatible with that layout.
- Some useful data types (boolean and MemoryAddress) are not supported by memory access var handles.
- The API makes excessive use of static methods. There is a class MemoryAccess containing several static dereference methods (see above), and the CLinker class also contains several static helper functions, e.g. to convert a Java string to a C string and back.
- The MemoryAddress class is an entity with its own ResourceScope object. The reason for this choice is that a client can e.g. request the base address of a memory segment, and expect the address to keep a reference to the segment scope. But making MemoryAddress a scoped entity creates confusion in the more common case where an address is returned by a native call, in which case neither spatial nor temporal bounds are available.
- Memory layouts interacting with the CLinker API need to be constructed in a special way: they must embed special layout attributes encoding additional information that allows the linker runtime to classify the argument correctly when a new downcall method handle is created. Also, there is some redundancy in how downcall method handles are created: clients have to pass both a FunctionDescriptor and a MethodType, even though, in most cases, the information in the MethodType can be inferred from that in the FunctionDescriptor.
- Calling native functions using downcall method handles can be unsafe: consider the case where a segment is passed by-reference to a downcall method handle. In this case, the segment address is obtained, and then passed to the native call. If the segment is backed by a shared scope, a client in another thread could close the segment scope concurrently, which might cause the native call to malfunction.
- The way in which dependencies between scopes are set up, using ResourceScope::acquire/release, is too low-level. There is no way to explicitly set up a temporal dependency between two scopes without resorting to complex uses of ResourceScope::addCloseAction.
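The first item in the list can be illustrated without the incubator API itself. The sketch below is a model only: java.nio.ByteBuffer stands in for a memory segment, and the class LayoutMismatchDemo and its methods are hypothetical names introduced for illustration. Data is written with an explicit byte order (playing the role of the layout passed at allocation time) and then read back through an accessor whose byte order is chosen independently, which is the kind of mismatch described above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical model: a ByteBuffer stands in for a memory segment.
class LayoutMismatchDemo {

    // "Allocation" side: the caller states the byte order of the elements,
    // much like passing a value layout to an allocator.
    static ByteBuffer allocateInts(ByteOrder order, int... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES).order(order);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf;
    }

    // "Dereference" side: no layout is involved; the byte order is picked
    // by the caller and may not match the one used at allocation time.
    static int readInt(ByteBuffer buf, int index, ByteOrder order) {
        // duplicate() shares the content but does not inherit the byte order,
        // so the order passed here is the only one that applies to the read.
        return buf.duplicate().order(order).getInt(index * Integer.BYTES);
    }

    public static void main(String[] args) {
        ByteBuffer ints = allocateInts(ByteOrder.BIG_ENDIAN, 1, 2);
        // Matching order: the original value is recovered.
        System.out.println(readInt(ints, 0, ByteOrder.BIG_ENDIAN));
        // Mismatched order: the bytes of the value are read reversed.
        System.out.println(readInt(ints, 0, ByteOrder.LITTLE_ENDIAN));
    }
}
```

Nothing in the model ties the read to the order used for the write, so the second read silently yields a byte-swapped value instead of failing.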
Solution
Here we describe the main ideas behind the API changes brought forward in this CSR:
- The main change in this iteration of the API is that ValueLayout is now always associated with a Java carrier type. For this reason, the API features specialized subclasses, like ValueLayout.OfInt, ValueLayout.OfLong etc. The relationship between ValueLayout and a Java carrier simplifies the API in a number of ways:
  - We can define a set of dereference methods accepting a (specialized) value layout subclass; for instance, instead of getInt() we can have a method like get(ValueLayout.OfInt). This allows us to fix the asymmetry between the dereference API and the allocation API.
  - We can use the carrier information attached to value layouts to decide how to classify parameters to downcall method handles. This effectively removes the need to accept a (now redundant) MethodType parameter in CLinker::downcallHandle. This also makes the layout attributes machinery redundant, and it has in fact been removed in this iteration.
  - We can attach constant var handles to value layouts, which means that obtaining a memory access var handle from a value layout can be far more efficient than before.
- Support for boolean and MemoryAddress has been added to memory access var handles. These carriers are considered secondary carriers (as opposed to primary carriers, such as byte, short, char, int, float, long, double). The reason for this distinction is that secondary carriers cannot be copied in bulk to and from memory segments, as each element requires some adjustment (e.g. a MemoryAddress has to be lowered to a long value, while a boolean has to be normalized to either 1 or 0).
- The API has been significantly simplified, and some classes have been removed:
  - The MemoryAccess class is no longer present. Instead, instance dereference methods are present in both MemorySegment and MemoryAddress (the latter are restricted, as an address has no bounds).
  - The MemoryLayouts class is also removed. Value layout constants (JAVA_INT etc.) have been moved inside ValueLayout (while other layout constants have been dropped).
  - Most of the static methods in CLinker (e.g. to convert from Java strings to C strings and back) have been moved to MemorySegment, MemoryAddress and SegmentAllocator. The platform-dependent layout constants in CLinker (C_INT etc.) have been dropped. It is the role of extraction tools to generate layouts for basic C types that are compatible with a given target platform.
  - The CLinker.TypeKind enum has been removed (as it is no longer attached to layouts for classification purposes).
  - The VaList class has been moved to the top level.
- MemoryAddress no longer features a ResourceScope accessor. That is, MemoryAddress denotes a raw machine address, and has no notion of spatial and temporal bounds associated with it. Clients can no longer obtain the base address associated with heap segments (i.e. MemoryAddress is for off-heap access only). When parameters are passed by-reference to a downcall method handle, the method handle now takes an Addressable parameter, not a MemoryAddress one. This change allows memory segments to be passed to downcall method handles more directly; the linker runtime will try to keep such arguments alive for the entire duration of a native call. This greatly enhances the safety of the CLinker API, and reduces the number of conversions required in user code.
- Since MemoryAddress no longer has a ResourceScope, a new entity named NativeSymbol has been added, which represents a symbol in a library (either a function or a global variable). A NativeSymbol has a scope and a name, and is accepted by CLinker::downcallHandle when creating downcall method handles. Also, CLinker::upcallStub returns a new (anonymous) NativeSymbol, which points to the native function generated by the VM which calls back to the target Java method handle provided at creation. The scope attached to a native symbol can be closed at any time, and closing it will cause the symbol to be unloaded. Again, CLinker will make sure that a native symbol scope cannot be closed while in the middle of performing a native call.
- The ResourceScope class contains some simplifications: first, there is no longer a distinction between implicit and explicit scopes. All scopes (but the global scope) are explicit and can be closed. Some scopes are additionally associated with a Cleaner instance. Secondly, a new method ResourceScope::keepAlive(ResourceScope) has been added to replace the pair of ResourceScope::acquire/release as well as the ResourceScope.Handle class.
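The first item above (value layouts bound to a Java carrier, with dereference methods that accept them) can be sketched with plain JDK classes. This is a simplified model and not the incubator API: the names TypedLayoutSketch, Layout, OfInt, OfBoolean, Segment and rawByte are hypothetical, and a ByteBuffer stands in for a memory segment. The sketch shows how overloading on a layout subclass lets a single get/set pair return the right carrier type and apply the byte order stored in the layout, and how a boolean carrier is normalized to 1 or 0 when written.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical, simplified model of carrier-typed value layouts.
class TypedLayoutSketch {

    // A layout records the byte order to apply on every access.
    static abstract class Layout {
        final ByteOrder order;
        Layout(ByteOrder order) { this.order = order; }
    }

    // Layout whose carrier is int.
    static final class OfInt extends Layout {
        OfInt(ByteOrder order) { super(order); }
    }

    // Layout whose carrier is boolean (one byte, so the order is irrelevant).
    static final class OfBoolean extends Layout {
        OfBoolean() { super(ByteOrder.nativeOrder()); }
    }

    // Stand-in for a memory segment, backed by a heap ByteBuffer.
    static final class Segment {
        private final ByteBuffer data;

        Segment(int byteSize) { this.data = ByteBuffer.allocate(byteSize); }

        // The layout subclass selects the overload, hence the return type.
        int get(OfInt layout, int offset) {
            return data.duplicate().order(layout.order).getInt(offset);
        }

        void set(OfInt layout, int offset, int value) {
            data.duplicate().order(layout.order).putInt(offset, value);
        }

        boolean get(OfBoolean layout, int offset) {
            return data.get(offset) != 0;
        }

        // A boolean is stored as a single byte, normalized to 1 or 0.
        void set(OfBoolean layout, int offset, boolean value) {
            data.put(offset, (byte) (value ? 1 : 0));
        }

        // Raw view of one byte, for inspecting what was actually stored.
        byte rawByte(int offset) { return data.get(offset); }
    }

    public static void main(String[] args) {
        OfInt intBigEndian = new OfInt(ByteOrder.BIG_ENDIAN);
        OfBoolean bool = new OfBoolean();
        Segment segment = new Segment(8);
        segment.set(intBigEndian, 0, 42);
        segment.set(bool, 4, true);
        System.out.println(segment.get(intBigEndian, 0));
        System.out.println(segment.get(bool, 4));
    }
}
```

Because the same layout object is used for both the write and the read, the byte order cannot drift between the two sides, which is the symmetry the item describes.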
Specification
A specdiff of the changes as of November 11th, 2021 has been attached to this CSR (v3).
A link to the latest javadoc (as of November 11th, 2021) is included below:
A link to the latest specdiff (as of November 11th, 2021) is included below:
http://cr.openjdk.java.net/~mcimadamore/JEP-419/v3/specdiff_out/overview-summary.html
csr of: JDK-8275063 Implementation of Foreign Function & Memory API (Second incubator) (Resolved)