JDK-4179605

(1.1.x) Unmarshal exception when rmiregistry is restarted.


    • Type: Bug
    • Resolution: Fixed
    • Priority: P3
    • 1.1.8
    • 1.1.7
    • core-libs
    • 1.1.8
    • generic
    • generic
    • Verified

      The fix for 4094891 needs to be backported to JDK 1.1.8. Customers at Adobe have received an UnmarshalException when a registry process is restarted and the cached connection to the registry is reused. JDK 1.1.8 does not check that the cached connection is still alive before reusing it.
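
The failure mode described above, reusing a cached connection without first checking that it is still alive, can be illustrated with a small sketch. This is not the RMI transport code and not the actual fix for 4094891: ConnectionCache, Connection, isAlive, and the factory are hypothetical names chosen for illustration, and the sketch uses current Java syntax rather than JDK 1.1.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical connection cache; names are illustrative, not taken from sun.rmi.transport.
final class ConnectionCache {

    /** Hypothetical connection abstraction with a liveness probe. */
    interface Connection {
        boolean isAlive();
        void close();
    }

    private final Deque<Connection> idle = new ArrayDeque<>();
    private final Supplier<Connection> factory;

    ConnectionCache(Supplier<Connection> factory) {
        this.factory = factory;
    }

    /** Returns a cached connection only if it is still alive; otherwise opens a new one. */
    synchronized Connection acquire() {
        Connection c;
        while ((c = idle.pollFirst()) != null) {
            if (c.isAlive()) {
                return c;          // safe to reuse
            }
            c.close();             // peer went away (e.g. registry restarted): discard it
        }
        return factory.get();      // nothing usable cached: open a fresh connection
    }

    /** Hands a connection back to the cache for later reuse. */
    synchronized void release(Connection c) {
        idle.addFirst(c);
    }
}
```

Without the isAlive check in acquire, the first call after the peer restarts would be sent over a dead connection, which is the behaviour this report describes.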


      The following email outlines the situation at Adobe:

      ------- start of forwarded message (RFC 934 encapsulation) -------
      From: Gunawan Herri <###@###.###>
      To: ###@###.###, ###@###.###
      Subject: question regarding Naming.lookup
      Date: Wed, 05 Aug 1998 11:40:43 -0700


      Dear Ann,
      I got your name from the rmi-users mailing list. I work for an Adobe
      division that is developing an application written exclusively in Java.
      We use RMI extensively.


      This is the problem that I have encountered, and I hope you can shed
      some light on it.
      My apologies if you're not the right person for this; if so, please
      route it to the appropriate people.


      ----------        ------------------------        -------------
      | client | -----> | LoadBalancingMachine | --+--> | ServerOne |
      ----------        ------------------------   |    -------------
                                                   |    -------------
                                                   +--> | ServerTwo |
                                                        -------------


      We are trying a product called HydraWeb that performs load balancing.
      When you connect to it, it polls the servers that it monitors (in this
      case, the ServerOne and ServerTwo machines) to find out which one is
      available. It then routes the caller to the appropriate machine.


      The client machine does a Naming.lookup using
      rmi://LoadBalancingMachine/testObject. Let's say the load-balancing
      machine routes the caller to ServerOne.
      When I print the UnicastRef object, it says ServerOne, so Naming.lookup
      and LoadBalancingMachine are doing their jobs correctly.


      Then I purposely killed rmiregistry and the processes on ServerOne. On
      the client side, my testClient catches ConnectException and/or
      UnmarshalException. When it calls a remote method and gets one of those
      exceptions, it does another Naming.lookup against
      LoadBalancingMachine, which should be routed to ServerTwo (since it is
      the only one that is available now).


      The problem that I encounter is that the client needs to do
      Naming.lookup at least twice before it succeeds in getting the remote
      object back. This is not a problem with the load-balancing machine: as
      soon as ServerOne is killed, a *fresh* copy of the client connects
      directly to ServerTwo with no problem, while the original client keeps
      failing.


      It seems that RMI is caching information about
      rmi://LoadBalancingMachine/testObject somewhere, which causes the
      UnmarshalException.


      The same thing happens even if I sleep for a long time (more than 30
      seconds) before I retry the Naming.lookup.


      Any ideas of how to approach this problem?


      Adobe does license Java from Sun and we do have Sun's source code.


      Thanks, Ann.


      cheers
      -herri


      The stack of the exception:


      in testRMImain:unmarshall
      java.rmi.UnmarshalException: Error unmarshaling return header
              at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:208)
              at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:104)
              at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:94)
              at java.rmi.Naming.lookup(Naming.java:60)
              at testRMI$Client.run(testRMI.java:83)
              at java.lang.Thread.run(Thread.java:474)
      Sleep for: 11000 ms
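
The client-side recovery described in the email (catch ConnectException or UnmarshalException, then repeat the Naming.lookup) could be sketched roughly as follows. This is an illustration only, not the reporter's testRMI code: RetryingLookup, the Lookup interface, and the maxAttempts and delayMillis parameters are hypothetical, and the sketch uses current Java syntax rather than JDK 1.1. The lookup is passed in as a parameter so that the retry logic can be exercised without a running registry.

```java
import java.rmi.ConnectException;
import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.UnmarshalException;

// Hypothetical helper; not part of java.rmi and not the reporter's code.
final class RetryingLookup {

    /** Abstraction over Naming.lookup so the retry loop can be tested in isolation. */
    interface Lookup {
        Remote lookup(String url) throws Exception;
    }

    /** Retries the lookup when it fails with ConnectException or UnmarshalException. */
    static Remote lookupWithRetry(Lookup lookup, String url,
                                  int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return lookup.lookup(url);
            } catch (ConnectException | UnmarshalException e) {
                last = e;                       // remember the failure and try again
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);  // pause before the next attempt
                }
            }
        }
        throw last;                             // every attempt failed
    }

    /** Convenience overload that goes through the real java.rmi.Naming.lookup. */
    static Remote lookupWithRetry(String url, int maxAttempts, long delayMillis)
            throws Exception {
        return lookupWithRetry(Naming::lookup, url, maxAttempts, delayMillis);
    }
}
```

A call such as RetryingLookup.lookupWithRetry("rmi://LoadBalancingMachine/testObject", 3, 1000L) would make up to three attempts. A loop like this only works around the stale cached connection on the client side; it does not remove the need for the runtime to check the connection before reusing it.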



            ldorninsunw Laird Dornin (Inactive)