It looks like there is a bug in Solaris 2.4's handling of writes to
sockets in non-blocking mode. This bug does not occur on 2.3.
I saw this when trying to write about 40K to a socket. Often, if the
other side was slow at reading the data, only the first 8K would make it
across, and then Java would hang. The socket is in non-blocking mode, so a
signal should be sent when it becomes possible to write more data to the
socket. However, on 2.4 the system would consistently fail to send the
SIGIO (or SIGPOLL). This was confirmed by running truss and watching the
writes and signals. On 2.3 the appropriate signal would be sent and
everything would be fine.
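To illustrate the failure mode, here is a minimal sketch of a write
loop that does not depend solely on a wakeup notification. It is not
the fix referred to in this report, and it is not how the Java runtime
does it; the names write_all and demo, the one-second timeout, and the
use of a local socket pair with a deliberately slow reader are all
assumptions made for illustration. The writer retries after a bounded
wait, so a lost "socket is writable" notification would delay the
write rather than hang it.

```python
import select
import socket
import threading
import time


def write_all(sock, data, timeout=1.0):
    """Write all of data to a non-blocking socket (hypothetical helper).

    When the kernel buffer is full, wait for writability with a bounded
    timeout and then retry, instead of waiting indefinitely for a signal.
    """
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        try:
            # A non-blocking send may accept only part of the data.
            sent += sock.send(view[sent:])
        except BlockingIOError:
            # Buffer full: wait until writable or until the timeout
            # expires, then try again either way.
            select.select([], [sock], [], timeout)
    return sent


def demo(size=40 * 1024):
    """Send `size` bytes to a slow reader over a local socket pair."""
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    received = bytearray()

    def slow_reader():
        # Read in small chunks with a pause, to mimic a slow peer.
        while len(received) < size:
            chunk = reader.recv(4096)
            if not chunk:
                break
            received.extend(chunk)
            time.sleep(0.001)

    thread = threading.Thread(target=slow_reader)
    thread.start()
    sent = write_all(writer, b"x" * size)
    thread.join()
    writer.close()
    reader.close()
    return sent, len(received)
```

Calling demo() returns the number of bytes sent and the number
received; with a working system both should equal the requested size.
The timeout trades a small amount of latency for not hanging if a
notification is never delivered.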
It is likely that Sami has also seen a manifestation of this problem.
He has seen WebRunner hang occasionally (2-3 times a day) in
ImageCreate on his 2.4 system. That could be explained by this
problem. He is going to try with the suggested fix in place to see if
the problem goes away.