  1. JDK
  2. JDK-8275064

Implementation of Foreign Function & Memory API (Second incubator)

    • CSR
    • Resolution: Approved
    • P3
    • 18
    • core-libs
    • None
    • source, behavioral
    • low
    • The changes described in this CSR refer to an incubating API. While there are some changes in signature which might break existing sources, the Foreign Function & Memory API is not enabled by default.
    • Java API
    • JDK

      Summary

      This CSR refers to the latest iteration of the Foreign Function & Memory API originally targeted for Java 17, with the goal of further consolidating the API, as well as addressing the feedback received so far from developers.

      Problem

      Real-world use of the Foreign Function & Memory APIs revealed some remaining usability issues, listed below:

      • There is an asymmetry between the allocation API (SegmentAllocator) and the dereference API. More specifically, when allocating a segment from an existing Java value/array, a SegmentAllocator also accepts the ValueLayout corresponding to the value/array element, so that necessary alignment constraints and endianness can be applied. But the static dereference methods in MemoryAccess do not take any layout argument; instead, they optionally accept a ByteOrder argument, to perform byte swapping. This asymmetry can lead to subtle mistakes, where a segment is allocated as an array whose element is defined by a given layout, but then the array is accessed in ways that are incompatible with that layout.

      • Some useful data types (boolean and MemoryAddress) are not supported by memory access var handles.

      • The API makes excessive use of static methods. There is a class MemoryAccess containing several static dereference methods (see above), and the CLinker class also contains several static helper functions to e.g. convert a Java string to a C string and back.

      • The MemoryAddress class is an entity with its own ResourceScope object. The reason for this choice is that a client can, for example, request the base address of a memory segment and expect the address to keep a reference to the segment scope. But making MemoryAddress a scoped entity creates confusion in the more common case where an address is returned by a native call, in which case neither spatial nor temporal bounds are available.

      • Memory layouts interacting with the CLinker API need to be constructed in a special way: they must embed special layout attributes that encode the additional information the linker runtime needs to classify each argument correctly when a new downcall method handle is created. There is also some redundancy in how downcall method handles are created: clients have to pass both a FunctionDescriptor and a MethodType, even though, in most cases, the information in the MethodType can be inferred from that in the FunctionDescriptor.

      • Calling native functions using downcall method handles can be unsafe: consider the case where a segment is passed by-reference to a downcall method handle. In this case, the segment address is obtained, and then passed to the native call. If the segment is backed by a shared scope, a client in another thread could close the segment scope concurrently, which might cause the native call to malfunction.

      • The way in which dependencies between scopes are set up, using ResourceScope::acquire/release, is too low-level. There is no way to explicitly set up a temporal dependency between two scopes without resorting to complex uses of ResourceScope::addCloseAction.
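
      The kind of mistake described in the first item above can be sketched with plain java.nio, used here only as a stand-in: the code below does not use the jdk.incubator.foreign API, and the class and method names are hypothetical. The allocation side fixes the element encoding, while the dereference side takes an unrelated ByteOrder argument, so nothing prevents the two from disagreeing.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Stand-in illustration using java.nio, NOT the jdk.incubator.foreign API.
// All names here are hypothetical.
final class ByteOrderMismatchDemo {

    // "Allocation" side: the element encoding (big-endian int) is fixed here.
    static ByteBuffer allocateBigEndianInts(int... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES);
        buf.order(ByteOrder.BIG_ENDIAN);
        for (int v : values) {
            buf.putInt(v);
        }
        buf.flip();
        return buf;
    }

    // "Dereference" side: nothing ties this read to the encoding chosen above,
    // so the caller has to remember to pass the matching byte order.
    static int readInt(ByteBuffer buf, int index, ByteOrder order) {
        return buf.duplicate().order(order).getInt(index * Integer.BYTES);
    }

    public static void main(String[] args) {
        ByteBuffer ints = allocateBigEndianInts(1, 2, 3);
        // Matching order: prints 1
        System.out.println(readInt(ints, 0, ByteOrder.BIG_ENDIAN));
        // Mismatched order: compiles and runs, but prints 16777216
        System.out.println(readInt(ints, 0, ByteOrder.LITTLE_ENDIAN));
    }
}
```

      The mismatched read fails silently: it returns the byte-swapped value rather than raising an error, which is why this class of bug is easy to miss.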

      Solution

      Here we describe the main ideas behind the API changes brought forward in this CSR:

      • The main change in this iteration of the API is that ValueLayout is now always associated with a Java carrier type. For this reason, the API features specialized subclasses, like ValueLayout.OfInt, ValueLayout.OfLong etc. The relationship between ValueLayout and a Java carrier simplifies the API in a number of ways:

        • We can define a set of dereference methods accepting a (specialized) value layout subclass; for instance, instead of getInt() we can have a method like get(ValueLayout.OfInt). This allows us to fix the asymmetry between the dereference API and the allocation API.
        • We can use the carrier information attached to value layouts to decide how to classify parameters to downcall method handles. This effectively removes the need to accept a (now redundant) MethodType parameter in CLinker::downcallHandle. It also makes the layout attributes machinery redundant, so that machinery has been removed in this iteration.
        • We can attach constant var handles to value layouts, which means that obtaining a memory access var handle from a value layout can be far more efficient than before.
      • Support for boolean and MemoryAddress has been added to memory access var handles. These carriers are considered secondary carriers (as opposed to primary carriers, such as byte, short, char, int, float, long, double). The reason for this distinction is that secondary carriers cannot be copied in bulk to and from memory segments, as each element requires some adjustment (e.g. a MemoryAddress has to be lowered to a long value, while a boolean has to be normalized to either 1 or 0).

      • The API has been significantly simplified, and some classes have been removed:

        • The MemoryAccess class is no longer present. Instead, instance dereference methods are present in both MemorySegment and MemoryAddress (the latter are restricted, as an address has no bounds).
        • The MemoryLayouts class is also removed. Value layout constants (JAVA_INT etc.) have been moved inside ValueLayout (while other layout constants have been dropped).
        • Most of the static methods in CLinker (e.g. to convert from Java strings to C strings and back) have been moved to MemorySegment, MemoryAddress and SegmentAllocator. The platform-dependent layout constants in CLinker (C_INT etc.) have been dropped. It is the role of extraction tools to generate layouts for basic C types that are compatible with a given target platform.
        • The CLinker.TypeKind enum has been removed (as it is no longer attached to layouts for classification purposes).
        • The VaList class has been moved to the top level.
      • MemoryAddress no longer features a ResourceScope accessor. That is, MemoryAddress denotes a raw machine address, and has no notion of spatial and temporal bounds associated with it. Clients can no longer obtain the base address associated with heap segments (i.e. MemoryAddress is for off-heap access only). When parameters are passed by-reference to a downcall method handle, the method handle now takes an Addressable parameter, not a MemoryAddress one. This change allows memory segments to be passed to downcall method handles more directly; the linker runtime will try to keep such arguments alive for the entire duration of a native call. This greatly enhances the safety of the CLinker API, and reduces the number of conversions required in user code.

      • Since MemoryAddress no longer has a ResourceScope, a new entity named NativeSymbol has been added, which represents a symbol in a library (either a function or a global variable). A NativeSymbol has a scope and a name, and is accepted by CLinker::downcallHandle when creating downcall method handles. Also, CLinker::upcallStub returns a new (anonymous) NativeSymbol, which points to the native function generated by the VM which calls back to the target Java method handle provided at creation. The scope attached to a native symbol can be closed at any time, and will cause the symbol to be unloaded. Again, CLinker will make sure that a native symbol scope cannot be closed while in the middle of performing a native call.

      • The ResourceScope class has been simplified in two ways. First, there is no longer a distinction between implicit and explicit scopes: all scopes except the global scope are explicit and can be closed, and some scopes are additionally associated with a Cleaner instance. Second, a new method, ResourceScope::keepAlive(ResourceScope), has been added to replace both the ResourceScope::acquire/release pair and the ResourceScope.Handle class.
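
      The idea of selecting a dereference method through a carrier-typed layout, together with the per-element adjustment needed for a secondary carrier such as boolean, can be sketched as a small model on top of java.nio. This is an illustration only: the nested types OfInt and OfBoolean below are hypothetical stand-ins that merely echo the names ValueLayout.OfInt and friends, and the code does not use the jdk.incubator.foreign API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Simplified model built on java.nio. The nested layout types are
// hypothetical and are NOT the jdk.incubator.foreign classes.
final class TypedLayoutSketch {

    // A layout whose static type fixes the Java carrier (int) and
    // which also records the byte order to use on access.
    static final class OfInt {
        final ByteOrder order;
        OfInt(ByteOrder order) { this.order = order; }
    }

    // boolean plays the role of a "secondary" carrier: every element
    // is adjusted (normalized to 0 or 1) when it is stored.
    static final class OfBoolean { }

    static final OfInt INT_BE = new OfInt(ByteOrder.BIG_ENDIAN);
    static final OfInt INT_LE = new OfInt(ByteOrder.LITTLE_ENDIAN);
    static final OfBoolean BOOL = new OfBoolean();

    private final ByteBuffer bytes;

    TypedLayoutSketch(int size) {
        this.bytes = ByteBuffer.allocate(size);
    }

    // Overload resolution picks the accessor from the layout's static type,
    // so the same layout object drives both the write and the read.
    void set(OfInt layout, int offset, int value) {
        bytes.order(layout.order).putInt(offset, value);
    }

    int get(OfInt layout, int offset) {
        return bytes.order(layout.order).getInt(offset);
    }

    void set(OfBoolean layout, int offset, boolean value) {
        bytes.put(offset, (byte) (value ? 1 : 0)); // normalize to 1 or 0
    }

    boolean get(OfBoolean layout, int offset) {
        return bytes.get(offset) != 0;
    }

    // Exposes the stored byte so the normalization can be observed.
    byte rawByte(int offset) {
        return bytes.get(offset);
    }

    public static void main(String[] args) {
        TypedLayoutSketch seg = new TypedLayoutSketch(8);
        seg.set(INT_LE, 0, 42);
        seg.set(BOOL, 4, true);
        System.out.println(seg.get(INT_LE, 0) + " " + seg.get(BOOL, 4));
    }
}
```

      Because the return type of get follows from the layout argument, a call such as get(INT_LE, 0) yields an int without a cast, and reusing the layout constant for both set and get keeps the two sides of the access in agreement.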

      Specification

      A specdiff of the changes as of November 11th, 2021 has been attached to this CSR (v3).

      A link to the latest javadoc (as of November 11th, 2021) is included below:

      http://cr.openjdk.java.net/~mcimadamore/JEP-419/v3/javadoc/jdk/incubator/foreign/package-summary.html

      A link to the latest specdiff (as of November 11th, 2021) is included below:

      http://cr.openjdk.java.net/~mcimadamore/JEP-419/v3/specdiff_out/overview-summary.html

        1. specdiff_v4.zip
          573 kB
        2. specdiff_v3.zip
          553 kB
        3. specdiff_10_11_2020.zip
          549 kB

            mcimadamore Maurizio Cimadamore
            Jorn Vernee