JDK-7145846

RMI newConnection() method performs unnecessary Ping/PingAck


    • Type: Bug
    • Resolution: Won't Fix
    • Priority: P3
    • None
    • 6
    • core-libs
    • None
    • x86
    • solaris_10

      I. Problem Description:
      While performing RMI tests between a server in Paris and a client in Singapore, Murex observed unexpected behavior at the network level when calling a remote method: each call is preceded by a “PING” request initiated by the RMI client.
      While this is not an issue on a low-latency network, this PING adds a 200ms overhead to each call in Murex's case. As a consequence, interactivity is significantly degraded for users who issue about 1 request per second and, more importantly, the display of trade positions can be erroneous when remote methods are called through a listener.

      Murex expects a fix that avoids performing a Ping/PingAck just before the method invocation: TCPConnection.isDead() should not be called in the “newConnection” method of class “sun.rmi.transport.tcp.TCPChannel”.

      II. Platforms
      Solaris platform: Oracle Solaris 10 9/10 s10x_u9wos_14a X86
      Java version: "1.6.0_06"
      Java(TM) SE Runtime Environment (build 1.6.0_06-b02)
      Java HotSpot(TM) Server VM (build 10.0-b22, mixed mode)

      III. Problem Analysis:
      The method newConnection of class sun.rmi.transport.tcp.TCPChannel is called each time we invoke a remote method.
      This method uses a freelist to reuse an existing TCP connection when possible. When a free sun.rmi.transport.tcp.TCPConnection is found, its “isDead” method is called to ensure that the connection is still valid.
      The isDead method performs a PING on this available connection:

      int response = 0;
      try {
          o.write(TransportConstants.Ping);
          o.flush();
          response = i.read();
      } catch (IOException ex) {
          TCPTransport.tcpLog.log(Log.VERBOSE, "exception: ", ex);
          TCPTransport.tcpLog.log(Log.BRIEF, "server ping failed");
          return (true); // server failed the ping test
      }

      Since there is already a mechanism to discard inactive TCP connections (property sun.rmi.transport.connectionTimeout, a.k.a. “idleTimeout” in TCPChannel), this PING appears to be unnecessary, or at least too pessimistic.

      IV. Test case description:
      In the attached jar archive, you will find a minimal test case of the issue (sources & .class files).
      The test case consists of the following:
      1) Start a simple RMI server
      2) Start the client that will perform the following actions:
      a. Lookup the remote object
      b. Make 5 warm-up remote calls, without waiting between each call.
      c. Sleep 1 second to simulate user action (1s should be greater than 2*RTT of the first RMI call)
      d. Perform an RMI call that will be preceded by a Ping/PingAck sequence.
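
The steps above can be sketched as a single-JVM program. This is only an illustration under stated assumptions, not the attached test case: the class name PingIssueSketch, the Echo interface and the "echo" binding name are hypothetical, and server and client share one JVM over loopback. On loopback the extra round trip is too small to see in the timings, so a packet capture remains the way to observe the Ping/PingAck.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

public class PingIssueSketch {

    /** Hypothetical remote interface; the attached test case may use a different one. */
    public interface Echo extends Remote {
        String echo(String message) throws RemoteException;
    }

    /** Trivial implementation that returns its argument. */
    public static class EchoImpl implements Echo {
        @Override
        public String echo(String message) {
            return message;
        }
    }

    /**
     * Runs the server and client steps in one JVM. Returns the elapsed time of
     * each call in nanoseconds: the warm-up calls first, then the call made
     * after the pause.
     */
    public static long[] run(int warmUpCalls, long pauseMillis) throws Exception {
        EchoImpl impl = new EchoImpl();
        // 1) Start a simple RMI server on anonymous ports.
        Registry registry = LocateRegistry.createRegistry(0);
        Echo stub = (Echo) UnicastRemoteObject.exportObject(impl, 0);
        registry.rebind("echo", stub);
        try {
            // 2a) Look up the remote object.
            Echo remote = (Echo) registry.lookup("echo");
            long[] elapsed = new long[warmUpCalls + 1];
            // 2b) Warm-up calls, without waiting between calls.
            for (int i = 0; i < warmUpCalls; i++) {
                elapsed[i] = timedCall(remote);
            }
            // 2c) Pause to simulate user action.
            Thread.sleep(pauseMillis);
            // 2d) The call that the report expects to be preceded by a Ping/PingAck.
            elapsed[warmUpCalls] = timedCall(remote);
            return elapsed;
        } finally {
            // Unexport so that the JVM can exit.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    private static long timedCall(Echo remote) throws RemoteException {
        long start = System.nanoTime();
        remote.echo("hello");
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        long[] elapsed = run(5, 1000L);
        for (int i = 0; i < elapsed.length; i++) {
            System.out.println("call " + (i + 1) + ": " + (elapsed[i] / 1_000) + " us");
        }
    }
}
```

Running main while capturing loopback traffic should make it possible to compare the packets exchanged for the warm-up calls with those exchanged for the final call.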

      In the attached image of the network traces of the test case, we see that after opening the TCP connection and negotiating the JRMP protocol, the first five calls are performed without the Ping/PingAck sequence.
      For the last call, however, the client initiates a Ping/PingAck just before making the remote invocation (highlighted packet).

      How to launch the attached test case:
      1) Launch the server:
            % java -cp PingIssueSources.jar Server
      2) Start a packet capture in Wireshark
      3) Launch the client:
            % java -cp PingIssueSources.jar Client <Server IP>
      When performing usability tests for a Murex customer, we noticed that using RMI over a WAN between Paris and Singapore results in high response times for the end user. Checking the TCP connection before each remote method invocation doubles the duration of each user action.

      This behavior was introduced in 1998 for Java 1.2 (by CR 4094891) in order to address the following problem: "[...] failure because the TCP connection being cached is already dead, but RMI doesn't check until after sending the next request."

      That means that RMI is not able to differentiate a MarshalException received by the client from an exception caused by the underlying TCP connection being dead.

      We believe that CR 4094891 doesn't address this issue in all cases; it only reduces the probability that the problem arises.

      Indeed, the PING is initiated only after a certain timeout (namely 2*RTT), meaning that if the connection fails before that timeout has elapsed, the exception won't be avoided by the current mechanism.

      The issue that CR 4094891 tried to address is a matter of exception handling (i.e. in ActivatableRef.invoke), and has not been treated that way. A more pessimistic approach was taken instead: a PING is performed before each RMI call (if the last RMI call was initiated more than 2*RTT ago).
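
The cost of this rule can be illustrated with a small model. It is a sketch based only on the rule as described in this report, not on the JDK source: the class and method names are hypothetical, server processing time is ignored, and the RTT is taken to be the 200ms quoted in the problem description.

```java
public class PingCostModel {

    /** Rule as described in the report: ping when the last call was more than 2*RTT ago. */
    public static boolean pingsBeforeCall(long idleMillis, long rttMillis) {
        return idleMillis > 2 * rttMillis;
    }

    /** Latency seen by the caller: one round trip for the call, plus one for the ping if it is sent. */
    public static long callLatencyMillis(long idleMillis, long rttMillis) {
        long ping = pingsBeforeCall(idleMillis, rttMillis) ? rttMillis : 0;
        return ping + rttMillis;
    }

    public static void main(String[] args) {
        long rtt = 200; // assumed round-trip time in milliseconds
        System.out.println(callLatencyMillis(0, rtt));    // back-to-back call: prints 200
        System.out.println(callLatencyMillis(1000, rtt)); // about one request per second: prints 400
    }
}
```

Under these assumptions, a user issuing about one request per second always falls outside the 2*RTT window, so every call pays for the extra round trip.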

      Our proposal is to call TCPConnection.isDead when catching the MarshalException in ActivatableRef.invoke, which would make it possible to differentiate a real MarshalException from a dead TCP connection (and to retry transparently for the caller).
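
The general shape of this proposal can be sketched as follows. This is not RMI code: the Connection interface, the opener parameter and the method name are hypothetical stand-ins, and the sketch assumes that the request can safely be sent again on a new connection.

```java
import java.io.IOException;
import java.util.function.Supplier;

public class RetryOnDeadConnection {

    /** Hypothetical stand-in for a cached transport connection. */
    public interface Connection {
        String invoke(String request) throws IOException;

        /** Liveness probe, for example a Ping/PingAck exchange. */
        boolean isDead();
    }

    /**
     * Sends the request on the cached connection without probing it first.
     * The connection is probed only if the call fails: a dead connection is
     * replaced and the request is retried once, otherwise the failure is rethrown.
     */
    public static String invokeOptimistically(Connection cached,
                                              Supplier<Connection> opener,
                                              String request) throws IOException {
        try {
            return cached.invoke(request);
        } catch (IOException failure) {
            if (!cached.isDead()) {
                throw failure; // genuine error on a live connection
            }
            // Assumption: resending the request on a fresh connection is safe.
            return opener.get().invoke(request);
        }
    }
}
```

With this ordering, the liveness probe costs a round trip only on the failure path, so calls on a healthy connection are not delayed.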

            msheppar Mark Sheppard
            cteissed Claude Teissedre (Inactive)