JDK-6378870

Confusing error "java.net.SocketException: Invalid argument" for socket disconnection


    • b05
    • x86, sparc
    • solaris_9, windows_2008
    • Not verified

        * socket.setTcpNoDelay(tcpNoDelay) reported the following error:

        ERROR [org.apache.tomcat.util.net.PoolTcpEndpoint] Socket error caused by remote host /167.10.54.100
        java.net.SocketException: Invalid argument
                at java.net.PlainSocketImpl.socketSetOption(Native Method)
                at java.net.PlainSocketImpl.setOption(Unknown Source)
                at java.net.Socket.setTcpNoDelay(Unknown Source)
                at org.apache.tomcat.util.net.PoolTcpEndpoint.setSocketOptions(PoolTcpEndpoint.java:503)
                at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:515)
                at org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:112)
                at java.lang.Thread.run(Unknown Source)

        * corresponding truss output:

        /182: setsockopt(244, tcp, TCP_NODELAY, 0xFFFFFFFE907FEE80, 4, 1) Err#22 EINVAL
        /182: write(11, " 2 0 0 5 - 1 2 - 2 2 2".., 644) = 644
        /182: Incurred fault #6, FLTBOUNDS %pc = 0xFFFFFFFF3905CCC0
        /182: siginfo: SIGSEGV SEGV_MAPERR addr=0x00000008
        /182: Received signal #11, SIGSEGV [caught]
        /182: siginfo: SIGSEGV SEGV_MAPERR addr=0x00000008
         
        /180: setsockopt(31, tcp, TCP_NODELAY, 0xFFFFFFFE90DFED80, 4, 1) Err#22 EINVAL
        /180: write(11, " 2 0 0 5 - 1 2 - 2 2 2".., 644) = 644
        /180: sysinfo(SI_HOSTNAME, "i240", 256) = 5
        /180: door_info(4, 0xFFFFFFFE90DFB528) = 0

        * JBoss support evaluated the problem and recommended that:

        "we recommend Sun take a look at it to prevent further
        confusion for others later. Tomcat developers have already agreed to modify
        Tomcat to ignore your error message when running in a Solaris environment. This
        change should make it into the next revision of Tomcat.

        The problem seems to be specific to Solaris, and is just that Solaris reports an
        EINVAL when most other implementations do not. Apparently, this behavior wasn't
        documented until Solaris 9, and that's why it wasn't accounted for. Foo.java
        (written by JBoss Support) demonstrates the issue.

        At a minimum, our team recommends updating Java documentation to note this
        condition when running in Solaris. It would be cute if the JVM could know that
        Solaris behaves that way and react accordingly."

        * Here's the evaluation:

        "The EINVAL is in response to a TCP RST sent by the content switch. The content
        switch sent a TCP RST because Tomcat couldn't respond within the 3 seconds
        allowed by the content switch. Tomcat couldn't respond in time because the
        Garbage Collector was going wild. The Garbage Collector was doing tons of work
        in response to an application bug.

        Therefore, you don't care why the EINVAL was there. For the most part, that has
        been accounted for (there was still one unexplained occurrence; we'll update
        you if we find any evidence regarding that).

        Your concern is just the JVM's reaction to the EINVAL in a Solaris environment,
        should Sun care to pursue it.

        2. The Sockets API in Java is not truly portable because it still closely
        mirrors the behavior of the OS's internal socket implementation. The root of the
        problem is that Solaris is unique in that calls to setsockopt can result in an
        EINVAL if the underlying connection has closed. This behavior was actually not
        documented on Solaris 8; they did finally document it in Solaris 9.

        So, the JVM does not know the reason for the EINVAL, and thus it just passes it
        up to the Java application as a SocketException. So they really aren't doing
        anything wrong (since it is Solaris that is doing it). I would recommend sending
        them Foo.java in case they want to add special code that relays a different
        message, or maybe they want to update the documentation to Socket.set*() to
        indicate the behavior on Solaris.

        3. Tomcat treated SocketExceptions that occur on Socket.setTcpNoDelay() (and
        others) as an error instead of a normal condition. This is because of the
        following:

        1. Most platforms do not return an error on calls to setsockopt.
        2. Solaris does do this, but it was not documented at the time the JVM and
        Tomcat were developed.
        3. The Tomcat error was difficult to reproduce, because it only occurs when a
        client quickly closes its connection between the initial call to accept() and
        the first call to setsockopt(). (This information was of course not known when
        the problem was reported in the past, because no one has been able to gather the
        data that shows how it occurs until now.)
        4. EINVAL is usually used to indicate a bad argument was passed to the call (in
        fact this is what the Solaris 8 documentation says). This gives one the
        impression of something wrong in the JVM, because it is the JVM's responsibility
        to pass correct data structures to OS system calls.

        So, while this condition is rare, it is still normal, and so future versions of
        Tomcat will treat it as such, and no longer log it."
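
The handling described in the evaluation above (treating a failed socket-option call on a just-accepted connection as a dropped connection rather than an error) could look roughly like the sketch below. This is not Tomcat's actual change: the class `SocketOptionGuard` and its methods are hypothetical names chosen for illustration, and the sketch assumes that any `SocketException` from `setTcpNoDelay` means the connection is no longer usable. Because the Solaris EINVAL only appears after a peer reset, the demo method uses an already-closed socket, which makes `setTcpNoDelay` throw a `SocketException` on any platform.

```java
import java.io.IOException;
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper for illustration; not part of Tomcat or the JDK.
final class SocketOptionGuard {

    /**
     * Tries to apply TCP_NODELAY to a freshly accepted socket.
     * Returns true on success. Returns false (after closing the socket) if the
     * call fails with a SocketException, which on Solaris may carry the message
     * "Invalid argument" when the peer has already reset the connection.
     */
    static boolean trySetTcpNoDelay(Socket socket, boolean tcpNoDelay) {
        try {
            socket.setTcpNoDelay(tcpNoDelay);
            return true;
        } catch (SocketException e) {
            // Assumption: a failure here means the connection is unusable,
            // so it is dropped quietly instead of being logged as an error.
            closeQuietly(socket);
            return false;
        }
    }

    private static void closeQuietly(Socket socket) {
        try {
            socket.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close() itself fails.
        }
    }

    /** Portable demonstration: a closed socket also makes setTcpNoDelay throw. */
    static boolean demoOnClosedSocket() {
        Socket socket = new Socket();   // unconnected socket
        closeQuietly(socket);           // closed before any option is set
        return trySetTcpNoDelay(socket, true);
    }

    public static void main(String[] args) {
        System.out.println("options applied: " + demoOnClosedSocket());
    }
}
```

A caller in an accept loop would skip request processing whenever `trySetTcpNoDelay` returns false. Swallowing every `SocketException` is a deliberate simplification here; a real server might still log the event at a debug level.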


        * testcase

        ----------------Foo.java---------------------
        import java.io.IOException;
        import java.net.ServerSocket;
        import java.net.Socket;


        public class Foo implements Runnable
        {
           public int turn = SERVER;
           public static final int SERVER = 1;
           public static final int CLIENT = 2;

           public static void main(String[] args) throws Exception
           {
              ServerSocket server = null;
              Socket client = null;
              try{
                server = new ServerSocket(4444);
              } catch (IOException e) {
                System.out.println("Could not listen on port 4444");
                System.exit(-1);
              }

              Foo foo = new Foo();
              new Thread(foo).start();

              try{
                client = server.accept();
              } catch (IOException e) {
                 System.out.println("Accept failed: " + e);
                 System.exit(-1);
              }

              System.out.println("Accepted Socket");
              foo.handOff(CLIENT);
              foo.waitFor(SERVER);

              System.out.println("Setting TCP NO_DELAY");

              // On Solaris this throws java.net.SocketException: Invalid argument (EINVAL)
              client.setTcpNoDelay(false);

              // On all other OSes you will see a connection reset error here
              client.getInputStream().read();
              server.close();
           }

           public synchronized void waitFor(int who) throws InterruptedException
           {
              while (turn != who)
                 wait();
           }

           public synchronized void handOff (int who) throws InterruptedException
           {
              turn = who;
              notify();
           }
           public void run()
           {
              try
              {
                 Socket socket = new Socket("localhost", 4444);
                 waitFor(CLIENT);
                 System.out.println("Sending RST!");
                 socket.setSoLinger(true, 0);
                 socket.close();
                 handOff(SERVER);
              }
              catch (Exception e)
              {
                 throw new RuntimeException(e);
              }
           }
        }
        ----------------Foo.java---------------------

              jccollet Jean-Christophe Collet (Inactive)
              lkchow Lawrence Chow