JDK-4072213

Memory leak in RMI java code

    • 1.1.5
    • other
    • generic
    • Not verified



        Name: rs12567 Date: 08/15/97


        IBM HOT LIST BUG #3
        -
        Reported to IBM JTC by San Francisco Project.
        -
        There's a memory leak in the RMI java code, which is quite serious,
        because it will eventually make a long-running RMI server fall over
        with an OutOfMemoryError.
        -
        The problem is with the hashtable used in sun.rmi.transport.DGCAckHandler.
        We have a testcase which was created to test the scalability of a San
        Francisco server. The purpose of the testcase was to ensure that we
        were freeing up all the objects that were being used so they could be
        garbage collected. The testcase has a client which simply loops creating
        Entities (which are RMI remote objects) on the San Francisco server, and
        then releases each Entity, i.e. no longer references it.
        -
        The heap size was set to 1 meg to reduce the amount of memory in the San
        Francisco server. We found we could get on average about 1200 of these
        objects created before we ran out of memory.
        -
        After digging into this, we found that the hashtable in DGCAckHandler
        continued to grow but never came back down. In my understanding, this
        hashtable is used to save the AckHandler until the ack comes back from
        the client. When the ack comes from the client, the entry is removed
        from the hashtable. At first it appeared that the client was never
        sending the ack. Further investigation showed that the client was
        sending the ack, but the server handled it before the ack handler had
        been put into the hashtable. (On NT, it even made a difference which
        window had the focus, the client or the server.) Since I have access
        to the source, I put a "hack" into DGCAckHandler to remove the entries
        in the hashtable once its size exceeded 5. With this hack, we were able
        to create 12000 objects instead of 1200. An entry is no longer needed
        once its ack has been received, so entries that stay in the hashtable
        are, in effect, leaked memory. This will cause real problems for a
        busy San Francisco server.
        -
        -
        HOW TO REPRODUCE THIS PROBLEM
        -
        If you take the 'Hello, World' example from the JDK documentation in
        the docs/guide/rmi/examples/hello directory, and modify it like this:
        -
        Change Hello.java so it returns Hello instead of String:
        ------
        package examples.hello;

        public interface Hello extends java.rmi.Remote {
            Hello sayHello() throws java.rmi.RemoteException;
        }
        ------
        Change the 'sayHello()' method in HelloImpl.java so it looks like
        this:
        ------
            public Hello sayHello() throws RemoteException {
                return new HelloImpl("Child");
            }
        ------
        Completely change the HelloApplet to this:
        ------
        package examples.hello;

        import java.awt.*;
        import java.rmi.*;

        public class HelloApplet extends java.applet.Applet implements Runnable {
            public void init()
            {
                new Thread(this).start();
            }
            public void run()
            {
                try {
                    Hello obj = (Hello) Naming.lookup("//" + getCodeBase().getHost() + "/HelloServer");
                    while (true) {
                        Hello hello = obj.sayHello();
                        System.out.println("got hello");
                    }
                }
                catch (Exception e) {e.printStackTrace();}
            }
        }
        ------
        In the run script, change the line that loads the server to this,
        so that there is only 1MB of heap space:
        ------
        run java -ms1M -mx1M -Djava.rmi.server.codebase="$codebase_url" $server &
        ------
        Now we have to make some changes to the JDK classes (bear with me).
        In src/share/sun/sun/rmi/transport/DGCAckHandler.java, add this println
        to the end of the 'received(UID id)' method:
        ------
            public static void received(UID id)
            {
                DGCAckHandler entry = (DGCAckHandler)objLists.remove(id);
                if (entry != null) {
                    entry.removeNotifiable();
                }
                System.out.println("Hash table entries = "+objLists.size());
            }
        ------
        You can run the modified Hello example now, and you will most likely
        see 'Hash table entries = 0' appearing repeatedly on the screen.
        Because the bug is timing-dependent, we need to add a delay, otherwise
        it doesn't happen often enough for the purposes of a test case.
        In src/share/sun/sun/rmi/transport/ConnectionOutputStream.java, add a
        delay to the 'done(Connection c)' method like this:
        ------
            void done(Connection c) {
                if ((objList != null) && !objList.isEmpty()) {
                    try {Thread.sleep(200);} catch (InterruptedException e) {}
                    new DGCAckHandler(c.getChannel(), ackID, objList);
                }
            }
        ------
        This gives the server enough time to get the ACK back from the
        client before the DGCAckHandler is added to DGCAckHandler's objLists
        hash table.
        Now when you run the example, you should see the number after
        'Hash table entries = ' increasing each time around the loop.
        If it doesn't, then increase the delay.
        This shows that objLists is filling up.
        If you leave the test case to run for a few minutes, then eventually
        you will get an OutOfMemoryError (at 'Hash table entries = 1720'
        or so).
        -
        -
        WHAT IS THE CAUSE OF THE PROBLEM?
        -
        The problem is that before the 'new DGCAckHandler(..)' is issued in
        sun.rmi.transport.ConnectionOutputStream.done(Connection c), it is
        possible that the ACK has already come back from the client.
        This is handled in sun.rmi.transport.DGCAckHandler.received(UID id),
        which is called from sun.rmi.transport.tcp.TCPTransport.handleMessages(..).
        The received(..) method will do nothing if the hash table doesn't
        contain the DGCAckHandler object yet. When the DGCAckHandler object
        is subsequently added to the hash table, it will never be removed.
        Repeated RMI calls will eventually use up all the memory.
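        The ordering problem described above can be reproduced in isolation.
        The following is only a minimal sketch, not the real sun.rmi.transport
        code: the class AckRaceSketch, its methods and the use of plain
        integer ids are hypothetical stand-ins for DGCAckHandler, its
        objLists table and UID, and it uses generics, so it needs a newer
        Java than the 1.1.5 in this report. It forces the "ack first,
        register second" order on every call:

```java
import java.util.Hashtable;

/** Hypothetical stand-in for the ack bookkeeping; not the real RMI classes. */
public class AckRaceSketch {
    // Plays the role of DGCAckHandler's objLists table: ack id -> pending handler.
    static final Hashtable<Integer, Object> pending = new Hashtable<>();

    // Mirrors received(UID id): removes the entry if present, otherwise does nothing.
    public static void received(int id) {
        pending.remove(id);
    }

    // Mirrors 'new DGCAckHandler(..)': stores the handler under its ack id.
    public static void register(int id) {
        pending.put(id, new Object());
    }

    // Simulates 'calls' remote calls in which each ack is handled before registration.
    public static int leakedEntries(int calls) {
        pending.clear();
        for (int id = 0; id < calls; id++) {
            received(id);   // ack handled first: there is nothing to remove yet
            register(id);   // handler added afterwards: no later ack will remove it
        }
        return pending.size();
    }

    public static void main(String[] args) {
        System.out.println("Hash table entries = " + leakedEntries(1000));
    }
}
```

        Every simulated call leaves one entry behind, so the table size
        equals the number of calls made.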
        -
        -
        SUGGESTED FIX
        -
        In src/share/sun/sun/rmi/transport/StreamRemoteCall.java, delete the
        line that says 'out.done(conn);' from the 'releaseOutputStream()'
        method, and add it to 'getResultStream(boolean success)' just
        before the call to 'out.writeID()'. The code should look like this:
        -
            public ObjectOutput getResultStream(boolean success)
                throws ...
            ...
                    if (success) //
                        out.writeByte(TransportConstants.NormalReturn);
                    else
                        out.writeByte(TransportConstants.ExceptionalReturn);
                    out.done(conn); // <-- added
                    out.writeID(); // write id for gcAck
                    return out;
                }
            ...
            public void releaseOutputStream() throws IOException
            {
                try {
                    if (out != null) {
                        out.flush();
                        /* out.done(conn); <-- deleted */
                    }
                    conn.releaseOutputStream();
                }
                finally {
                    out = null;
                }
            }
        -
        With this change, the DGCAckHandler object is added to the hash
        table before the ID is sent to the client, so there is no
        chance of the ACK coming back before the DGCAckHandler is added.
        I have not been through ALL the implications of this change, but
        I have looked at most of them, and I think this change is on
        the right track.
        If you apply the above change and re-run the example, then the
        hash table size doesn't increase, and you get 'Hash table
        entries = 0' each time round the loop.
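        The effect of the reordering can be illustrated the same way. Again
        this is only a sketch under assumptions: AckOrderingSketch and its
        methods are hypothetical, integer ids stand in for UID, and the
        sketch assumes (as the fix does) that the client cannot send its
        ACK until the id has been written to it:

```java
import java.util.Hashtable;

/** Hypothetical illustration of "register before sending the id". */
public class AckOrderingSketch {
    // Plays the role of DGCAckHandler's objLists table: ack id -> pending handler.
    static final Hashtable<Integer, Object> pending = new Hashtable<>();

    // Mirrors received(UID id): removes the entry if it is present.
    public static void received(int id) {
        pending.remove(id);
    }

    // Mirrors 'new DGCAckHandler(..)': stores the handler under its ack id.
    public static void register(int id) {
        pending.put(id, new Object());
    }

    // Simulates 'calls' remote calls with the handler registered before the ack can arrive.
    public static int entriesAfter(int calls) {
        pending.clear();
        for (int id = 0; id < calls; id++) {
            register(id);   // corresponds to out.done(conn) running before out.writeID()
            received(id);   // the ack can only follow the id, so the entry is found
        }
        return pending.size();
    }

    public static void main(String[] args) {
        System.out.println("Hash table entries = " + entriesAfter(1000));
    }
}
```

        With registration guaranteed to happen first, each ack finds and
        removes its entry, and the table is empty after every call.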

        ======================================================================

         1997-10-20
        Description from another customer:

        I'm working on a telecommunications project that currently has four JVMs,
        not necessarily all running on the same Sparc5. Of these four, three leak memory
        (at the rate of 4-8 kb per minute) even when they are not being exercised,
        i.e. they are just sitting and waiting for input, whether from the user, or
        another process. (I used vmstat to monitor usage.) The ones that leak are
        either RMI servers or the user interface. We'd like to keep the servers
        running as long as possible.

        The one that doesn't leak uses neither AWT components nor RMI.

        I've also noticed that the bytes free and bytes total reported by java's
        verbosegc option and by calls to Runtime seem to be different. Aren't they
        the same heap?
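        For reference, the Runtime figures mentioned here can be sampled as
        in the sketch below. The class name HeapReport and the snapshot()
        helper are hypothetical; only java.lang.Runtime's freeMemory() and
        totalMemory() are real API. Note that these report the Java heap of
        the running JVM, whereas vmstat reports memory at the operating
        system level, so the two are not expected to match directly.

```java
/** Hypothetical helper that samples the heap figures exposed by java.lang.Runtime. */
public class HeapReport {
    // Returns {free bytes, total bytes} of the Java heap as seen by this JVM.
    public static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.freeMemory(), rt.totalMemory() };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        // "used" is derived here; Runtime does not report it directly.
        System.out.println("free=" + s[0] + " total=" + s[1] + " used=" + (s[1] - s[0]));
    }
}
```

        Logging such a snapshot at intervals in an idle server gives a
        heap-level view to compare against the process-level growth.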

              duke J. Duke
              rschiavisunw Richard Schiavi (Inactive)