Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8218559

JEP 353: Reimplement the Legacy Socket API

XMLWordPrintable

    • Alan Bateman
    • Feature
    • Open
    • JDK
    • net dash dev at openjdk dot java dot net
    • S
    • 353

      Summary

      Replace the underlying implementation used by the java.net.Socket and java.net.ServerSocket APIs with a simpler and more modern implementation that is easy to maintain and debug. The new implementation will be easy to adapt to work with user-mode threads, a.k.a. fibers, currently being explored in Project Loom.

      Motivation

      The java.net.Socket and java.net.ServerSocket APIs, and their underlying implementations, date back to JDK 1.0. The implementation is a mix of legacy Java and C code that is painful to maintain and debug. The implementation uses the thread stack as the I/O buffer, an approach that has required increasing the default thread stack size on several occasions. The implementation uses a native data structure to support asynchronous close, a source of subtle reliability and porting issues over the years. The implementation also has several concurrency issues that require an overhaul to address properly. In the context of a future world of fibers that park instead of blocking threads in native methods, the current implementation is not fit for purpose.

      Description

      The java.net.Socket and java.net.ServerSocket APIs delegate all socket operations to a java.net.SocketImpl, a Service Provider Interface (SPI) mechanism that has existed since JDK 1.0. The built-in implementation is termed the “plain” implementation, implemented by the non-public PlainSocketImpl with supporting classes SocketInputStream and SocketOutputStream. PlainSocketImpl is extended by two other JDK-internal implementations that support connections through SOCKS and HTTP proxy servers. By default, a Socket and ServerSocket is created (sometimes lazily) with a SOCKS based SocketImpl. In the case of ServerSocket, the use of the SOCKS implementation is an oddity that dates back to experimental (and since removed) support for proxying server connections in JDK 1.4.

      The new implementation, NioSocketImpl, is a drop-in replacement for PlainSocketImpl. It is developed to be easy to maintain and debug. It shares the same JDK-internal infrastructure as the New I/O (NIO) implementation so it doesn't need its own native code. It integrates with the existing buffer cache mechanism so that it doesn’t need to use the thread stack for I/O. It uses java.util.concurrent locks rather than synchronized methods so that it can play well with fibers in the future. In JDK 11, the NIO SocketChannel and the other SelectableChannel implementations were mostly re-implemented with the same goal in mind.

      The following are a few points about the new implementation:

      • SocketImpl is a legacy SPI mechanism and is very under-specified. The new implementation attempts to be compatible with the old implementation by emulating unspecified behavior and exceptions where applicable. The Risks and Assumptions section below details the behavior differences between the old and new implementations.

      • Socket operations using timeouts (connect, accept, read) are implemented by changing the socket to non-blocking mode and polling the socket.

      • The java.lang.ref.Cleaner mechanism is used to close sockets when the SocketImpl is garbage collected and the socket has not been explicitly closed.

      • Connection reset handling is implemented in the same way as the old implementation so that attempts to read after a connection reset will fail consistently.

      ServerSocket is modified to use NioSocketImpl (or PlainSocketImpl) by default. It no longer uses the SOCKS implementation.

      The SocketImpl implementations to support SOCKS and HTTP proxy servers are modified to delegate so they can work with the old and new implementations.

      The instrumentation support for socket I/O in Java Flight Recorder is modified to be independent of the SocketImpl so that socket I/O events can be recorded when running with either the new, old, or custom implementations.

      To reduce the risk of switching the implementation after more than twenty years, the old implementation will not be removed. The old implementation will remain in the JDK and a system property will be introduced to configure the JDK to use the old implementation. The JDK-specific system property to switch to the old implementation is jdk.net.usePlainSocketImpl. If set, or set to the value true, at startup, then the old implementation will be used. Some future release will remove PlainSocketImpl and the system property.

      This JEP does not propose to provide an alternative implementation of DatagramSocketImpl at this time (DatagramSocketImpl is the underlying implementation that instances of java.net.DatagramSocket delegate to). The built-in default implementation (PlainDatagramSocketImpl) is a maintenance (and porting) burden and may be the subject of another JEP.

      Testing

      The existing tests in the jdk/jdk repository will be used to test the new implementation. The jdk_net test group has accumulated many tests for networking corner case scenarios over the years. Some of the tests in this test group will be modified to run twice, the second time with -Djdk.net.usePlainSocketImpl to ensure that the old implementation does not bit-rot during the time that the JDK includes both implementations.

      A lot of code today makes direct or indirect use of libraries that use APIs defined in java.nio.channels rather than the java.net.Socket and java.net.ServerSocket APIs. Every effort will be made to create awareness of the proposal and encourage developers that have code using Socket and ServerSocket to test their code with the early-access builds that are published on jdk.java.net or elsewhere.

      The microbenchmarks in the jdk/jdk repository include benchmarks for socket read/write and streaming. These benchmarks have been improved to make it easy to compare the old and new implementations. As things stand, the new implementation is about the same or 1-3% better than the old implementation on the socket read/write tests.

      Risks and Assumptions

      The primary risk of this proposal is that there is existing code that depends on unspecified behavior in corner cases where the old and new implementations behave differently. The differences that have been identified so far are listed here; all but the first two can be mitigated by running with -Djdk.net.usePlainSocketImpl.

      • The InputStream and OutputStream returned by PlainSocketImpl's getInputStream() and getOutputStream() methods extend java.io.FileInputStream and java.io.FileOutputStream respectively. It is possible, but unlikely, that there is existing code that depends on this.

      • A ServerSocket using a custom SocketImpl cannot accept connections that return a Socket with a platform SocketImpl. Similarly, a ServerSocket using the platform SocketImpl cannot accept connections that return a Socket with a custom SocketImpl.

      • The InputStream and OutputStream returned by the old implementation tests the stream for EOF and returns -1 before other checks. The new implementation does null and bounds checks before checking if the stream is at EOF. It is possible, but unlikely, that there is fragile code that will trip up due to the order of the checks.

      • Closing a Socket with unread bytes in the receive queue will close the underlying socket gracefully. On platforms other than Microsoft Windows, the same scenario with the old implementation will lead to abortive/hard close.

      • Oracle Solaris specific: Oracle Solaris differs to other platforms in the way that it reports “connection reset” to applications. For example, calls to setsockopt or ioctl can fail when there is a network error. The xnet_skip_checks setting can be configured in /etc/system to disable this behavior (echo "xnet_skip_checks/W 1" | mdb -kw on a live system). The old implementation handles the case where ioctl(FIOREAD) fails so that attempts to read after available fails with a “connection reset” will fail consistently. This is fragile and unmaintainable, the new implementation does not attempt to emulate this behavior.

      • Oracle Solaris specific: Oracle Solaris does not allow the IPV6_TLCASS socket option to be changed on a TCP socket after it is connected. The old implementation masks this by caching the value specified to the setTrafficClass method.

      • The java.net package defines many sub-classes of SocketException. The new implementation will attempt to throw the same specific SocketException as the old implementation but there may be cases where they are not the same. Furthermore, there may be cases where the exception messages differ. On Microsoft Windows for example, the old implementation maps Windows Socket error codes to English-only messages, while the new implementation uses the system messages.

      Aside from behavioral differences, the performance of the new implementation may differ to the old when running certain workloads. In the old implementation several threads calling the accept method on a ServerSocket will queue in the kernel. In the new implementation, one thread will block in the accept system call, the others will queue waiting to acquire a java.util.concurrent lock. Performance characteristics may differ in other scenarios too.

      Finally, there may be instrumentation agents or tools that instrument the non-public java.net.SocketInputStream and java.net.SocketOutputStream classes to get I/O events. These classes are not used by the new implementation.

            alanb Alan Bateman
            alanb Alan Bateman
            Alan Bateman Alan Bateman
            Brian Goetz, Chris Hegarty, Michael McMahon
            Brian Goetz
            Votes:
            0 Vote for this issue
            Watchers:
            10 Start watching this issue

              Created:
              Updated:
              Resolved: