If an exception occurs while unmarshalling a remote call's arguments, UnicastServerRef.dispatch marshals the exception back to the client and then returns without consuming any argument data beyond what had been read at the point of the exception. Assuming that the exception was marshalled successfully, TCPTransport.handleMessages then attempts to read another message from the JRMP connection. That is a dubious thing to do in this situation (see 4415668): the input stream is still positioned where the unmarshalling exception left it, somewhere in the middle of a partially unmarshalled argument, so in all likelihood the read fails quickly with some protocol error. The eventual reaction to this protocol error is to close the socket fully (for both output and input).
If the client is marshalling a very large amount of argument data, such that it is still writing argument data to the socket after the server has fully closed its socket, then the server side's TCP implementation will send a TCP reset to the client upon receipt of such a write. If the client attempts another write after receiving this TCP reset, the client side's TCP implementation will fail that write with a "Connection reset by peer" indication (or equivalent). The client-side RMI implementation then considers the remote call to have failed with a java.rmi.MarshalException wrapping the IOException for the "Connection reset by peer".
The client-side RMI implementation never reads the original unmarshalling exception that caused the problem, even though the server-side RMI implementation had sent it for the client's benefit. As a result, the original exception, which is the real cause of the failure, is difficult to debug.
This difficulty has been the cause of numerous problems reported by users, such as on the RMI-USERS and JINI-USERS lists and the rmi-comments alias.
On Linux, the client-side failure might look like this:
java.rmi.MarshalException: error marshalling arguments; nested exception is:
java.net.SocketException: Connection reset by peer: socket write error
and on Windows, it might look like this:
java.rmi.MarshalException: error marshalling arguments; nested exception is:
java.net.SocketException: Software caused connection abort: socket write error
or something similar.
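As a rough illustration of how a client might recognize this symptom, the sketch below flags a java.rmi.MarshalException whose cause is a java.net.SocketException. The class name MaskedFailureCheck and the method mayMaskServerException are hypothetical names introduced here, not part of RMI, and a match is only a hint: a genuine network failure produces the same shape of exception.

```java
import java.net.SocketException;
import java.rmi.MarshalException;
import java.rmi.NoSuchObjectException;
import java.rmi.RemoteException;

// Hypothetical helper, not part of the RMI API.
final class MaskedFailureCheck {

    // Returns true if the failure has the shape described above: a
    // MarshalException wrapping a SocketException. Such a failure MAY be
    // hiding an exception that the server sent but the client never read.
    static boolean mayMaskServerException(RemoteException e) {
        return e instanceof MarshalException
            && e.getCause() instanceof SocketException;
    }

    public static void main(String[] args) {
        RemoteException symptom = new MarshalException(
            "error marshalling arguments",
            new SocketException("Connection reset by peer: socket write error"));
        RemoteException direct = new NoSuchObjectException("no such object in table");

        System.out.println(mayMaskServerException(symptom)); // prints "true"
        System.out.println(mayMaskServerException(direct));  // prints "false"
    }
}
```

A check like this cannot recover the hidden exception; it only tells the caller that the reported network error deserves a look at the server side.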
The same problem can occur if the server side throws a NoSuchObjectException upon reading the invocation's object ID, which happens before it even starts to unmarshal the arguments. In that case none of the argument data is consumed until it is misinterpreted as the next transport-level message, and the server side closes the connection. If the argument data is very large, then the same problem described above will occur, with the client seeing an apparent network problem instead of the root cause NoSuchObjectException.
This NoSuchObjectException case may seem qualitatively more problematic than the unmarshalling failure case, because in certain situations NoSuchObjectException can be more of an "expected" remote invocation failure mode, as with various schemes involving remote objects that can go away and be restarted. Also, a remote invocation that throws NoSuchObjectException can be considered "safe" to retry without violating at-most-once execution semantics, but MarshalException and UnmarshalException, in general, cannot. Therefore, in the NoSuchObjectException case, this bug causes a safe-to-retry failure to appear like an unsafe-to-retry failure, which is unfortunate.
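The retry consequence can be sketched with a deliberately conservative classifier. RetryClassifier and isSafeToRetry are hypothetical names introduced for illustration, and the policy shown (only NoSuchObjectException counts as safe) is an assumption made for this sketch, not a rule defined by RMI. The point is that the root cause would be classified as retryable, while the exception the client actually observes is not.

```java
import java.net.SocketException;
import java.rmi.MarshalException;
import java.rmi.NoSuchObjectException;
import java.rmi.RemoteException;

// Hypothetical retry policy, not part of the RMI API.
final class RetryClassifier {

    // Assumed policy for this sketch: only NoSuchObjectException is treated
    // as safe to retry, because the call never reached the remote object.
    // MarshalException and UnmarshalException are treated as unsafe, since
    // the call may or may not have executed.
    static boolean isSafeToRetry(RemoteException e) {
        return e instanceof NoSuchObjectException;
    }

    public static void main(String[] args) {
        // What the server actually threw.
        RemoteException rootCause = new NoSuchObjectException("no such object in table");
        // What the client sees when the connection is reset mid-write.
        RemoteException observed = new MarshalException(
            "error marshalling arguments",
            new SocketException("Connection reset by peer: socket write error"));

        System.out.println(isSafeToRetry(rootCause)); // prints "true"
        System.out.println(isSafeToRetry(observed));  // prints "false"
    }
}
```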
Relates to: JDK-4415668 UnmarshalException can leave conn in invalid state (Open)