JDK-8250521

Configure initial RTO to use minimal retry for loopback connections on Windows


    • b11
    • generic
    • windows_10

        Problem statement
        By default, Windows applies a 2-second timeout to socket connection attempts. The timeout applies equally to external addresses and to the loopback adapter. This has long been a design choice by the Windows team, and changing the behaviour would be a breaking change.

        This behaviour causes a performance issue for services that rely on socket connection attempts for service discovery, e.g. services in a cluster. Typically, the members of a service cluster send periodic socket connect beacons as specified in the cluster configuration files.

        This pattern is present in the Gradle/Kotlin compile daemons, where several daemons may run on a given system and work together to compile the project files faster.

        The solution to the performance issue
        As suggested by the Windows networking engineers, when connecting to a socket on the loopback adapter we can shorten the default socket timeout and avoid retries. This resolves the problem for localhost connections and helps with the Gradle build system's performance issues.

        To fix the issue we can use the SIO_TCP_INITIAL_RTO socket I/O control code (set via WSAIoctl) to override the socket defaults.
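
        The override is meant only for loopback targets, so the change needs a way to tell loopback destinations from external ones. The actual fix lives in native Windows code in the webrev; the following is only a minimal Java sketch of the loopback test itself, using the standard InetAddress.isLoopbackAddress() call. The class and method names (LoopbackCheck, isLoopbackTarget) are hypothetical and do not appear in the change.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative only: LoopbackCheck and isLoopbackTarget are hypothetical names,
// not part of the JDK change.
class LoopbackCheck {

    // Returns true when the given IP literal denotes a loopback address.
    static boolean isLoopbackTarget(String ipLiteral) {
        try {
            // For an IP literal, getByName only parses the text; no DNS lookup is made.
            return InetAddress.getByName(ipLiteral).isLoopbackAddress();
        } catch (UnknownHostException e) {
            // Not a valid literal: treat it as a non-loopback target.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String target : new String[] {"127.0.0.1", "::1", "192.168.1.1"}) {
            System.out.println(target + " loopback? " + isLoopbackTarget(target));
        }
    }
}
```

        Only destinations for which such a check succeeds would get the shortened initial RTO; external addresses keep the Windows defaults.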


        Webrev: https://cr.openjdk.java.net/~adityam/nikola/fast_connect_loopback_3/

        Testcase
        To demonstrate the problem I have written a simple socket connect benchmark:

        import java.io.IOException;
        import java.net.Inet6Address;
        import java.net.InetAddress;
        import java.rmi.ConnectException;
        import java.rmi.ConnectIOException;
        import java.rmi.registry.LocateRegistry;
        import java.rmi.server.RMISocketFactory;

        public class Test {

            final String IPV4_LOOPBACK_INET_ADDRESS = "127.0.0.1";
            final String IPV6_LOOPBACK_INET_ADDRESS = "::1";

            public static void main(String[] args) {
                Test test = new Test();

                test.measureFailedConnectToAddress(test.loopbackInetAddressName());
                test.measureFailedConnectToAddress("192.168.1.1");
            }

            void measureFailedConnectToAddress(String address) {
                System.out.print("Benchmarking for address " + address);
                long startTime = System.currentTimeMillis();
                for (int i = 0; i < 10; i++) {
                    tryConnectToDaemon(address, 12345 + i);
                }

                System.out.println(", elapsed time " + (System.currentTimeMillis() - startTime) + " ms");
            }

            String loopbackInetAddressName() {
                try {
                    // getByName(null) returns an address of the loopback interface
                    if (InetAddress.getByName(null) instanceof Inet6Address) {
                        return IPV6_LOOPBACK_INET_ADDRESS;
                    } else {
                        return IPV4_LOOPBACK_INET_ADDRESS;
                    }
                } catch (IOException e) {
                    // getByName may fail for unknown reasons in some situations; the fallback is to assume IPv4 for now
                    return IPV4_LOOPBACK_INET_ADDRESS;
                }
            }

            private void tryConnectToDaemon(String address, int port) {
                RMISocketFactory defaultFactory =
                        RMISocketFactory.getDefaultSocketFactory();
                try {
                    LocateRegistry.getRegistry(
                            address,
                            port,
                            defaultFactory).lookup("KotlinJvmCompilerService");
                } catch (ConnectException | ConnectIOException e) {
                    // expected: nothing is listening on the port, so the connect attempt fails
                } catch (Exception x) {
                    throw new RuntimeException(x);
                }
            }
        }
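
        The benchmark above goes through the RMI registry lookup. To look at the raw connect cost without RMI, a plain-socket variant could be used. This is a minimal sketch, not part of the original test case: the ConnectTimer class, its method name, and the 5000 ms connect timeout are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of the original test case.
class ConnectTimer {

    // Attempts one TCP connect and returns the elapsed time in milliseconds,
    // whether the attempt succeeds or fails.
    static long timeConnectMillis(String host, int port, int timeoutMillis) {
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
        } catch (IOException e) {
            // Expected when nothing listens on the port (refused or timed out).
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Same first port as the benchmark above; assumed to have no listener.
        System.out.println("127.0.0.1:12345 took "
                + timeConnectMillis("127.0.0.1", 12345, 5000) + " ms");
    }
}
```

        If the delay comes from the TCP connect itself rather than from RMI, a single refused connect on an unpatched Windows system should take a time comparable to one iteration of the benchmark.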

        Performance results
        Before the change:

        Benchmarking for address 127.0.0.1, elapsed time 20687 ms
        Benchmarking for address 192.168.1.1, elapsed time 20574 ms

        After the change:

        Benchmarking for address 127.0.0.1, elapsed time 356 ms
        Benchmarking for address 192.168.1.1, elapsed time 21113 ms

        Linux OS for comparison:

        Benchmarking for address 127.0.0.1, elapsed time 42 ms
        Benchmarking for address 192.168.1.1, elapsed time 59 ms
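
        Each benchmark run makes 10 connect attempts, so the totals above can be turned into per-attempt figures. The sketch below does that arithmetic for the Windows loopback numbers: roughly 2 seconds per attempt before the change, roughly 36 ms after, a speedup of about 58x. The ConnectStats class and its method names are hypothetical helpers; the input values are copied from the results above.

```java
// Hypothetical helper for the arithmetic; inputs come from the results above.
class ConnectStats {

    static final int ATTEMPTS = 10; // loop count used by the benchmark

    // Average time of one connect attempt, in milliseconds.
    static double perAttemptMillis(long totalMillis) {
        return (double) totalMillis / ATTEMPTS;
    }

    // How many times faster the "after" run is compared to the "before" run.
    static double speedup(long beforeMillis, long afterMillis) {
        return (double) beforeMillis / afterMillis;
    }

    public static void main(String[] args) {
        long before = 20687; // 127.0.0.1 on Windows, before the change
        long after = 356;    // 127.0.0.1 on Windows, after the change
        System.out.printf("before: %.1f ms per attempt%n", perAttemptMillis(before));
        System.out.printf("after:  %.1f ms per attempt%n", perAttemptMillis(after));
        System.out.printf("speedup: %.1fx%n", speedup(before, after));
    }
}
```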

              dgrieve David Grieve