The complete details provided by the customer are below :
I am running a standalone JGroups test program that has completed and is trying to shut down
our NIO TCP-based communication connections and then the NIO Selector handling threads.
At the time of failure, we are trying to close all communication connections.
I am trying to close the connections from the main Java thread
(the read and write Selector processing threads are still registered to receive event notifications for these connections).
I can probably work around the problem once I understand why the exception is generated.
Would it be possible for you to explain what the exception means and what the native code is trying to do?
Test Case :
============
Attached.
P.S. : capture.txt is a text file containing the logs seen at the customer's setup.
Steps To Reproduce The Problem Reported :
==========================================
Please place the attached files in a folder on your Linux machine and update tcp_nio.xml to use
your machine's IP address (change the 2 instances of bind_addr="164.99.208.53").
Then, open 4 terminal shells and run the ./sendernio.sh command in each one.
Wait about 5 seconds after starting one before starting the next.
The four instances will communicate with each other for a short while and then terminate the test.
You will see the exception in some of the windows.
Also attached is capture.txt which is the output from one of my test runs.
The sequence of events in the Test Case is :
=============================================
1. Wait for the configured number of members to connect (num_members=4).
2. Each member sends the configured number of messages to other members (num_msgs=10000).
3. The Java main entry point is in control of starting the test and after completion, stopping it.
The main thread shuts down the group communication layer and exits.
O/S Details :
==============
Novell Linux Desktop 9 (based on SUSE Linux)
uname -a output :
"Linux smarlow 2.6.5-7.243-default #1 Mon Dec 5 21:08:42 UTC 2005 i686 i686 i386 GNU/Linux"
JDK Version Details :
======================
I tried both 1.5.0_04-b05 and 1.6.0-beta-b59g.
Note that the 1.6-based output below, which I included in a previous message, has a slightly different error message (Thread signal failed).
We think the problem is caused by the way I have written the JGroups NIO code.
I would like to correct this. Previously, I was just ignoring the error, but now I want to fix it.
Error Messages :
==================
I am getting the following error and am not sure what it really means.
It sounds like a bug between Java and native code, but I'm just guessing.
A snippet of the exception stack trace is :
With 1.5.0_04-b05 :
--------------------
Jun 27, 2006 3:33:44 PM org.jgroups.blocks.ConnectionTableNIO$Connection closeSocket
SEVERE: error closing socket connection
java.io.IOException: Invalid argument
at sun.nio.ch.NativeThread.signal(Native Method)
at sun.nio.ch.SocketChannelImpl.implCloseSelectableChannel(SocketChannelImpl.java:634)
at java.nio.channels.spi.AbstractSelectableChannel.implCloseChannel(AbstractSelectableChannel.java:201)
at java.nio.channels.spi.AbstractInterruptibleChannel.close(AbstractInterruptibleChannel.java:97)
at org.jgroups.blocks.ConnectionTableNIO$Connection.closeSocket(ConnectionTableNIO.java:913)
With 1.6.0-beta-b59 :
-----------------------
java.io.IOException: Thread signal failed
at sun.nio.ch.NativeThread.signal(Native Method)
at sun.nio.ch.SocketChannelImpl.implCloseSelectableChannel(SocketChannelImpl.java:638)
at java.nio.channels.spi.AbstractSelectableChannel.implCloseChannel(AbstractSelectableChannel.java:201)
at java.nio.channels.spi.AbstractInterruptibleChannel.close(AbstractInterruptibleChannel.java:97)
at org.jgroups.blocks.ConnectionTableNIO$Connection.closeSocket(ConnectionTableNIO.java:913)
Code snippet closing the socket :
----------------------------------
The org.jgroups.blocks.ConnectionTableNIO$Connection.closeSocket method looks like this:
    void closeSocket()
    {
        if (sock_ch != null)
        {
            try
            {
                if (sock_ch.isConnected() && sock_ch.isOpen()) {
                    sock_ch.close();
                }
            }
            catch (Exception e)
            {
                log.error("error closing socket connection", e);
            }
            sock_ch = null;
        }
    }
The root cause of this problem has yet to be determined.
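As a possible workaround while the root cause is open, the shutdown order could be changed so that the Selector thread stops before the main thread closes the channel. The stack traces show close() calling sun.nio.ch.NativeThread.signal, so the sketch below makes sure no other thread is still working with the channel when close() runs. It is only a minimal, self-contained sketch over a loopback connection. It has not been confirmed to avoid the exception, and every name in it (OrderedShutdownSketch, shutdownInOrder, demo) is hypothetical and not part of JGroups.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration; none of these names come from JGroups.
class OrderedShutdownSketch {

    /** Stops the selector thread first, then closes the channel from the caller's thread. */
    static void shutdownInOrder(AtomicBoolean running, Selector selector,
                                Thread selectorThread, SocketChannel ch)
            throws IOException, InterruptedException {
        running.set(false);      // ask the selector loop to exit
        selector.wakeup();       // unblock a pending select()
        selectorThread.join();   // wait until the thread no longer touches the channel
        SelectionKey key = ch.keyFor(selector);
        if (key != null) {
            key.cancel();        // deregister before closing
        }
        if (ch.isOpen()) {
            ch.close();
        }
        selector.close();
    }

    /** Runs the sequence over a loopback connection; returns true if everything ended up closed. */
    static boolean demo() {
        try {
            ServerSocketChannel server = ServerSocketChannel.open();
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: any free port
            int port = server.socket().getLocalPort();

            SocketChannel client = SocketChannel.open(new InetSocketAddress("127.0.0.1", port));
            SocketChannel accepted = server.accept();
            client.configureBlocking(false);

            final Selector selector = Selector.open();
            client.register(selector, SelectionKey.OP_READ);

            final AtomicBoolean running = new AtomicBoolean(true);
            Thread selectorThread = new Thread(new Runnable() {
                public void run() {
                    try {
                        while (running.get()) {
                            selector.select(200);            // returns early on wakeup()
                            selector.selectedKeys().clear(); // no real event handling in this sketch
                        }
                    } catch (IOException e) {
                        // selector failed; leave the loop
                    } catch (ClosedSelectorException e) {
                        // selector was closed underneath us; leave the loop
                    }
                }
            });
            selectorThread.start();

            shutdownInOrder(running, selector, selectorThread, client);
            accepted.close();
            server.close();
            return !client.isOpen() && !selector.isOpen() && !selectorThread.isAlive();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("clean shutdown: " + demo());
    }
}
```

In the reported test case the same idea would mean stopping the read and write Selector processing threads before closeSocket() is called from the main thread, rather than after.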
- duplicates
    JDK-6380091 IOException/SIGSEGV occurs during SocketChannel.close processing. (Resolved)
- relates to
    JDK-6285901 (so) Data corruption with asynchronous close (Solaris/Linux) (Resolved)