- Bug
- Resolution: Fixed
- P4
- 5.0u5, 6u20
- b05
- x86, sparc
- solaris_9, windows_2008
- Not verified

Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2204970 | 6u25 | Sunita Koppar | P3 | Closed | Fixed | b01 |
JDK-2204264 | 6u24-rev | Sunita Koppar | P3 | Resolved | Fixed | b22 |
JDK-2195037 | 6u22-rev | Sunita Koppar | P3 | Closed | Fixed | b07 |
* socket.setTcpNoDelay(tcpNoDelay) reported the following error:
ERROR [org.apache.tomcat.util.net.PoolTcpEndpoint] Socket error caused by remote host /167.10.54.100
java.net.SocketException: Invalid argument
    at java.net.PlainSocketImpl.socketSetOption(Native Method)
    at java.net.PlainSocketImpl.setOption(Unknown Source)
    at java.net.Socket.setTcpNoDelay(Unknown Source)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.setSocketOptions(PoolTcpEndpoint.java:503)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:515)
    at org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:112)
    at java.lang.Thread.run(Unknown Source)
* corresponding truss output:
/182: setsockopt(244, tcp, TCP_NODELAY, 0xFFFFFFFE907FEE80, 4, 1) Err#22 EINVAL
/182: write(11, " 2 0 0 5 - 1 2 - 2 2 2".., 644) = 644
/182: Incurred fault #6, FLTBOUNDS %pc = 0xFFFFFFFF3905CCC0
/182: siginfo: SIGSEGV SEGV_MAPERR addr=0x00000008
/182: Received signal #11, SIGSEGV [caught]
/182: siginfo: SIGSEGV SEGV_MAPERR addr=0x00000008
/180: setsockopt(31, tcp, TCP_NODELAY, 0xFFFFFFFE90DFED80, 4, 1) Err#22 EINVAL
/180: write(11, " 2 0 0 5 - 1 2 - 2 2 2".., 644) = 644
/180: sysinfo(SI_HOSTNAME, "i240", 256) = 5
/180: door_info(4, 0xFFFFFFFE90DFB528) = 0
* JBoss support evaluated the problem and recommended that:
"we recommend Sun take a look at it to prevent further confusion for others
later. Tomcat developers have already agreed to modify Tomcat to ignore your
error message when running in a Solaris environment. This change should make it
into the next revision of Tomcat.
The problem seems to be specific to Solaris, and is just that Solaris reports an
EINVAL when most other implementations do not. Apparently, this behavior wasn't
documented until Solaris 9, and that's why it wasn't accounted for. Foo.java
(written by JBoss Support) demonstrates the issue.
At a minimum, our team recommends updating Java documentation to note this
condition when running in Solaris. It would be cute if the JVM could know that
Solaris behaves that way and react accordingly."
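To make the recommendation concrete, here is a minimal sketch of how a caller might attach a clearer explanation to this exception. It is not the change made in the JDK or in Tomcat; the class `SocketOptionErrors`, the method `describe`, and the wording of the hint are hypothetical. It assumes that the `os.name` system property starts with `SunOS` on Solaris and that the exception message is the literal `Invalid argument` seen in the log above.

```java
import java.net.SocketException;

public class SocketOptionErrors {

    /**
     * Returns the exception message, extended with a hint when it matches
     * the pattern from this report (EINVAL from setsockopt on Solaris).
     * Hypothetical helper, for illustration only.
     */
    public static String describe(SocketException e, String osName) {
        String msg = String.valueOf(e.getMessage());
        boolean solaris = osName != null && osName.startsWith("SunOS");
        if (solaris && msg.contains("Invalid argument")) {
            return msg + " (on Solaris this can mean the peer has already"
                    + " reset or closed the connection)";
        }
        return msg;
    }

    public static void main(String[] args) {
        SocketException e = new SocketException("Invalid argument");
        // The hint is only appended when running on Solaris.
        System.out.println(describe(e, System.getProperty("os.name")));
    }
}
```

Matching on the message text is fragile, so a helper like this is better suited to log output than to control flow.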
* Here's the evaluation:
"The EINVAL is in response to a TCP RST sent by the content switch. The content
switch sent a TCP RST because Tomcat couldn't respond within the 3 seconds
allowed by the content switch. Tomcat couldn't respond in time because the
Garbage Collector was going wild. The Garbage Collector was doing tons of work
in response to an application bug.
Therefore, you don't care why the EINVAL was there. For the most part, that has
been accounted for (there was still one unexplained occurrence. We'll update
you if we find any evidence regarding that.)
Your concern is just the JVM's reaction to the EINVAL in a Solaris environment,
should Sun care to pursue it.
2. The Sockets API in Java is not truly portable because it still closely
mirrors the behavior of the OS's internal socket implementation. The root of the
problem is that Solaris is unique in that calls to setsockopt can result in an
EINVAL if the underlying connection has closed. This behavior was actually not
documented on Solaris 8; they did finally document it in Solaris 9.
So, the JVM does not know the reason for the EINVAL, and thus it just passes it
up to the Java application as a SocketException. So they really aren't doing
anything wrong (since it is Solaris that is doing it). I would recommend sending
them Foo.java in case they want to add special code that relays a different
message, or maybe they want to update the documentation to Socket.set*() to
indicate the behavior on Solaris.
3. Tomcat treated SocketExceptions that occur on Socket.setTcpNoDelay() (and
others) as an error instead of a normal condition. This is because of the
following:
  1. Most platforms do not return an error on calls to setsockopt.
  2. Solaris does do this, but it was not documented at the time the JVM and
  Tomcat were developed.
  3. The Tomcat error was difficult to reproduce, because it only occurs when a
  client quickly closes its connection between the initial call to accept() and
  the first call to setsockopt(). (This information was of course not known when
  the problem was reported in the past, because no one has been able to gather
  the data that shows how it occurs until now.)
  4. EINVAL is usually used to indicate a bad argument was passed to the call
  (in fact this is what the Solaris 8 documentation says). This gives one the
  impression of something wrong in the JVM, because it is the JVM's
  responsibility to pass correct data structures to OS system calls.
So, while this condition is rare, it is still normal, and so future versions of
Tomcat will treat it as such, and no longer log it."
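The approach described in the evaluation, treating a failure to set socket options on a freshly accepted socket as a normal condition, can be sketched as follows. This is an illustration only, not Tomcat's actual code; the class `TolerantSocketOptions` and the method `applyOptions` are hypothetical names. The `main` method simulates the failure portably with an already-closed socket, because the EINVAL case itself depends on the platform.

```java
import java.io.IOException;
import java.net.Socket;
import java.net.SocketException;

public class TolerantSocketOptions {

    /**
     * Tries to set TCP_NODELAY on an accepted socket. A SocketException is
     * treated as "the peer has probably gone away": the socket is closed
     * quietly and false is returned so the caller can skip the connection.
     */
    public static boolean applyOptions(Socket socket, boolean tcpNoDelay) {
        try {
            socket.setTcpNoDelay(tcpNoDelay);
            return true;
        } catch (SocketException e) {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful can be done if close() also fails.
            }
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a connection that was reset before options were set.
        Socket closed = new Socket();
        closed.close();
        System.out.println(applyOptions(closed, true)); // prints false
    }
}
```

A caller in an accept loop would skip request processing when `applyOptions` returns false and would not log the condition as an error.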
* testcase
----------------Foo.java---------------------
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class Foo implements Runnable
{
    // Whose turn it is to run; guarded by this object's monitor.
    public int turn = SERVER;
    public static final int SERVER = 1;
    public static final int CLIENT = 2;

    public static void main(String[] args) throws Exception
    {
        ServerSocket server = null;
        Socket client = null;
        try {
            server = new ServerSocket(4444);
        } catch (IOException e) {
            System.out.println("Could not listen on port 4444");
            System.exit(-1);
        }
        // Start the client thread, then accept its connection.
        Foo foo = new Foo();
        new Thread(foo).start();
        try {
            client = server.accept();
        } catch (IOException e) {
            System.out.println("Accept failed: " + e);
            System.exit(-1);
        }
        System.out.println("Accepted Socket");
        // Let the client reset the connection, then wait until it has done so.
        foo.handOff(CLIENT);
        foo.waitFor(SERVER);
        System.out.println("Setting TCP NO_DELAY");
        // On Solaris this throws SocketException: Invalid argument (EINVAL)
        client.setTcpNoDelay(false);
        // On all other OSes you will see a connection reset error here
        client.getInputStream().read();
        server.close();
    }

    // Blocks until it is the given party's turn.
    public synchronized void waitFor(int who) throws InterruptedException
    {
        while (turn != who)
            wait();
    }

    // Passes the turn to the given party and wakes the waiting thread.
    public synchronized void handOff(int who) throws InterruptedException
    {
        turn = who;
        notify();
    }

    public void run()
    {
        try
        {
            Socket socket = new Socket("localhost", 4444);
            waitFor(CLIENT);
            System.out.println("Sending RST!");
            // SO_LINGER with a zero timeout makes close() send a TCP RST.
            socket.setSoLinger(true, 0);
            socket.close();
            handOff(SERVER);
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }
}
----------------Foo.java---------------------
- backported by
  - JDK-2204264 Confusing error "java.net.SocketException: Invalid argument" for socket disconnection (Resolved)
  - JDK-2195037 Confusing error "java.net.SocketException: Invalid argument" for socket disconnection (Closed)
  - JDK-2204970 Confusing error "java.net.SocketException: Invalid argument" for socket disconnection (Closed)
- relates to
  - JDK-6997841 TEST_BUG: java/net/PlainSocketImpl/MisleadingSolarisExc.java test failes on non-solaris systems (Resolved)