The fix for 4094891 needs to be backported to JDK1.1.8. Customers at Adobe have received an UnmarshalException when a registry process is restarted and the cached connection to the registry is reused. JDK1.1.8 does not check that a cached connection is still alive before reusing it.
The following email outlines the situation at Adobe:
------- start of forwarded message (RFC 934 encapsulation) -------
From: Gunawan Herri <###@###.###>
To: ###@###.###, ###@###.###
Subject: question regarding Naming.lookup
Date: Wed, 05 Aug 1998 11:40:43 -0700
Dear Ann,
I got your name from the rmi-users mailing list. I'm working for an Adobe
division that is working on an application written exclusively in Java.
We use RMI extensively.
This is the problem that I have encountered, and I hope you can shed some
light on it.
My apologies if you're not the right person for this; please route it
to the appropriate people.
 ----------                     ------------------------
 | client | ---------------->   | LoadBalancingMachine |
 ----------                     ------------------------
                                     |            |
                              -------------  -------------
                              | ServerOne |  | ServerTwo |
                              -------------  -------------
We are trying to use a product called HydraWeb that performs load
balancing. When you connect to it, it polls the servers that it monitors
(in this case: the ServerOne and ServerTwo machines) and finds out which
one is available. It then routes the caller to the appropriate machine.
The client machine does Naming.lookup using
rmi://LoadBalancingMachine/testObject. Let's say the LoadBalancingMachine
routes the caller to ServerOne.
When I print the UnicastRef object, it says ServerOne, so Naming.lookup
and LoadBalancingMachine do their job correctly.
Then I purposely killed rmiregistry and the processes on ServerOne. On
the client side, my testClient tries to catch ConnectException and/or
UnmarshalException. When it calls a remote method and gets one of those
exceptions, it does another Naming.lookup against LoadBalancingMachine,
which should be routed to ServerTwo (since it is the only one that is
available now).
The problem that I encounter is: the client needs to do 'Naming.lookup'
at least 2 times before it succeeds in getting the RemoteObject back.
It is not a problem with the load-balancing machine, because as soon as
ServerOne is killed, a *fresh* new copy of the client connects directly
to ServerTwo with no problem while the original client is failing.
It seems that RMI is caching the information about
rmi://LoadBalancingMachine/testObject somewhere, which causes it to get
an UnmarshalException.
The same situation occurs even if I sleep for a long time (more than 30
seconds) before I retry the Naming.lookup.
Any ideas of how to approach this problem?
Adobe does license Java from Sun and we do have Sun's source code.
Thanks, Ann.
cheers
- -herri
The stack of the exception:
in testRMImain:unmarshall
java.rmi.UnmarshalException: Error unmarshaling return header
    at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:208)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:104)
    at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:94)
    at java.rmi.Naming.lookup(Naming.java:60)
    at testRMI$Client.run(testRMI.java:83)
    at java.lang.Thread.run(Thread.java:474)
Sleep for: 11000 ms
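The client-side retry described in the email can be sketched roughly as follows. This is only an illustration of a possible workaround, not the backported fix and not the customer's actual testRMI code. The names LookupRetry, withRetry, maxAttempts and delayMs are hypothetical, and the sketch uses current Java syntax (lambdas, multi-catch) that is not available in JDK1.1.8.

```java
import java.rmi.ConnectException;
import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.UnmarshalException;
import java.util.concurrent.Callable;

// Hypothetical helper: retries a lookup when the call fails with one of the
// two exceptions named in the email (ConnectException, UnmarshalException).
public class LookupRetry {

    // Runs the lookup up to maxAttempts times, sleeping delayMs between tries.
    // Any other exception is propagated immediately without a retry.
    public static <T> T withRetry(Callable<T> lookup, int maxAttempts, long delayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return lookup.call();
            } catch (ConnectException | UnmarshalException e) {
                // A stale cached connection may surface here on the first try.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMs);
                }
            }
        }
        throw last;
    }

    // Convenience wrapper around Naming.lookup for an rmi:// URL.
    public static Remote lookup(String url, int maxAttempts, long delayMs)
            throws Exception {
        return withRetry(() -> Naming.lookup(url), maxAttempts, delayMs);
    }
}
```

With this helper, a call such as LookupRetry.lookup("rmi://LoadBalancingMachine/testObject", 3, 1000) would hide the first failed lookup from the caller. It only masks the symptom; the stale connection is still attempted once.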
Relates to: JDK-4094891 activation: unable to retry call if cached connection to server is used (Closed)