Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-4228651

Distributed gc threads hang

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Duplicate
    • Icon: P4 P4
    • None
    • 1.1.7
    • core-libs
    • x86
    • windows_nt

      File descriptors for sockets seem to be propogated
      by Runtime.exec() calls such that if the parent process
      terminates, the child will continue to hold open it's OS file
      descriptors (visible in netstat command output). Since the
      child has no thread listening on the socket any attempts to
      connect to it succeed from a client's POV, but appear to be
      piped to the bit bucket. This hangs RMI's DGC threads, as well
      as any thread attempting to communicate with any remote objects
      from the, now deceased, parent process. (see below for our
      initial diagnosis.)


      The customer has a process which spawns multiple child processes
      then exits. The inherited descriptors result in the DGC
      threads on all servers that have interacted with the parent
      process to hang, as well as our internal client termination
      thread. The deadlock of these threads (esp. the DGC's LeaseRenewer)
      seems to prevent any additional RMI activity from taking
      place in the process, further deadlocking the process and
      the customer's entire application. (we suspect this has something
      to do with a lock on the object table held by the LeaseRenwer thread.)
      This problem does not exist on Solaris.


      ====================================================
      Here is an example of the hung process with jdk 1.1.6
      ====================================================
      Full thread dump:
          "Cleaner" (TID:0xf66090, sys_thread_t:0x8aa8d0, Win32ID:0x125,state:CW) prio=5
              sun.rmi.transport.DGCClient.run(Compiled Code)
              java.lang.Thread.run(Thread.java:474)
          "LeaseRenewer" (TID:0xf660a8, sys_thread_t:0x8aa2e0, Win32ID:0x78,state:R) prio=5
              java.net.SocketInputStream.read(Compiled Code)
              java.io.BufferedInputStream.fill(Compiled Code)
              java.io.BufferedInputStream.read(Compiled Code)
              java.io.DataInputStream.readByte(Compiled Code)
              sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:147)
              sun.rmi.server.UnicastRef.newCall(UnicastRef.java:73)
              sun.rmi.transport.DGCImpl_Stub.dirty(DGCImpl_Stub.java:55)
              sun.rmi.transport.DGCClient.renewLeases(Compiled Code)
              sun.rmi.transport.DGCClient.doRenewal(Compiled Code)
              sun.rmi.transport.DGCClient$LeaseRenewer.run(Compiled Code)
              java.lang.Thread.run(Thread.java:474)
          "Reaper" (TID:0xf660e0, sys_thread_t:0x8aa200, Win32ID:0x133, state:CW) prio=5
              sun.rmi.transport.Reaper.run(Compiled Code)
              java.lang.Thread.run(Thread.java:474)
          "Finalizer thread" (TID:0xf60088, sys_thread_t:0x89d560, Win32ID:0x93,state:CW) prio=2
          "main" (TID:0xf600b0, sys_thread_t:0x89a7d0, Win32ID:0x132, state:R) prio=5
              java.net.SocketInputStream.read(Compiled Code)
              java.io.BufferedInputStream.fill(Compiled Code)
              java.io.BufferedInputStream.read(Compiled Code)
              java.io.DataInputStream.readByte(Compiled Code)
              sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:147)
              sun.rmi.server.UnicastRef.newCall(UnicastRef.java:73)
              RMIWidget_Stub.pingMe(RMIWidget_Stub.java:27)
              hanger.main(Compiled Code)
      Monitor Cache Dump:
          <unknown key> (0x8aa200): <unowned>
              Waiters: 1
          java.io.BufferedInputStream@F6B8C8/FEACD0: owner "main" (0x89a7d0, 1 entry)
          java.io.BufferedInputStream@F6BC80/FEC170: owner "LeaseRenewer" (0x8aa2e0, 1 entry)
          java.lang.Object@F65E78/FBC208: owner "LeaseRenewer" (0x8aa2e0, 1 entry)
          java.util.Vector@F65E58/FBC258: <unowned>
              Waiters: 1
      Registered Monitor Dump:
          SymcJIT Method Monitor: <unowned>
          SymcJIT Method Monitor: <unowned>
          SymcJIT Method List Monitor: <unowned>
          SymcJIT Fixups Allocation: <unowned>
          SymcJIT Code Allocation: <unowned>
          SymcJIT Data Allocation: <unowned>
          Thread queue lock: <unowned>
          Name and type hash table lock: <unowned>
          String intern lock: <unowned>
          JNI pinning lock: <unowned>
          JNI global reference lock: <unowned>
          BinClass lock: <unowned>
          Class loading lock: <unowned>
          Java stack lock: <unowned>
          Code rewrite lock: <unowned>
          Heap lock: <unowned>
          Has finalization queue lock: <unowned>
          Finalize me queue lock: <unowned>
              Waiters: 1
          Monitor registry: <unowned>



      ====================================================
      COMPILATION INSTRUCTIONS
      ====================================================
      "javac" Compile all java files
      "rmic" Compile RMIWidget



      ====================================================
      TESTCASE INSTRUCTIONS
      ====================================================
      how to run this test case:
      (note: all lines output by the test case are marked with => herein to
      distinguish them...
      this flag will not really display, it's just to get around the non-marked
      up nature of a
      text file :)


      unzip the testcase to a directory, for reference I'll call it d:\9310

      open three command prompts to d:\9310
      run the rmiregistry in one window, this window may be minimized and
      ignored...

      run "java parent yes" in another window, you should see in this "parent"
      window...
      =>RMIWidget::RMIWidget(0) begin
      =>RMIWidget::RMIWidget(0) end
      this indicates that the rmiwidget has been successfully exported and
      registered with
      naming.

      now run "java hanger" in the third, it will say:
      =>Ping begining

      then the parent will say:
      =>RMIWidget-0 has been pinged...

      then the hanger will say:
      =>Ping completed
      and then start to draw this line:
      =>|----|----|----.
      one character per second...

      after the fifteen seconds are up "hanger" will repeat this (endless) loop
      of pinging
      the object exported from parent. You will see that after some time the
      parent will
      exec the child window and which is followed by a short nap:

      =>exec'ing...
      =>naping...

      you will see the new window start up and just sit there... it shouldn't
      report
      anything, it's only point in life is to inherit the file descriptors used
      by the
      parent to export an object, and thereby display the bug. at this point the
      parent
      is done and it will report that:

      =>done.

      I've put in then another delay before the parent exits so that we may see
      the hanger
      thread continue to ping the object on the parent process while both parent
      and child
      are up. After this delay the parent will exit and return to a command
      prompt. At
      this point the Hanger will finish the current 15 second delay it's on then
      report
      that it is begining a new ping. this is as far as it will ever get. the
      last thing
      it says is:

      =>Ping begining

      taking a thread break of the hanger at this point will show that the main
      thread is
      now deadlocked, let it sit a while and the DGC will wake up and attempt to
      renew
      leases, at this point a thread break will show that it is deadlocked....



      ====================================================
      SOURCE CODE:
      ====================================================



      ====================================================
      RMIWidgetInterface.java
      ====================================================
      public interface RMIWidgetInterface extends java.rmi.Remote {
         public void pingMe() throws java.rmi.RemoteException;
      }

      ====================================================
      RMIWidget.java
      ====================================================
      public class RMIWidget
      extends java.rmi.server.UnicastRemoteObject
      implements RMIWidgetInterface {
         private static final boolean trace = true;
         int number;
         public RMIWidget(int p) throws java.rmi.RemoteException {
            super();
            if (trace) System.out.println("RMIWidget::RMIWidget("+p+") begin");
            number = p;
            try {
               java.rmi.Naming.bind("RMIWidget-"+p, this);
            } catch (java.rmi.AlreadyBoundException abe) {
               try {
                  java.rmi.Naming.rebind("RMIWidget-"+p, this);
               } catch (java.rmi.AccessException ae) {
                  System.out.println("-=RESET THE REGISTRY=-");
               } catch (java.net.MalformedURLException mue) {
                  mue.printStackTrace();
               }
            } catch (java.net.MalformedURLException mue) {
               mue.printStackTrace();
            }
            if (trace) System.out.println("RMIWidget::RMIWidget("+p+") end");
         }
         public void pingMe() throws java.rmi.RemoteException {
            if (trace) System.out.println("RMIWidget-"+this.number+" has been
      pinged...");
         }
      }

      ====================================================
      parent.java
      ====================================================
      public class parent {
         public static void main (String[] args) {
            if (args.length != 1) {
               System.out.println("usage: java parent [yes|no]");
               System.exit(0);
            }
            boolean exit = (args[0].toLowerCase().charAt(0) == 'y');
            try {
               RMIWidgetInterface target = new RMIWidget(0);
               Thread.sleep(120*1000); //2 minutes to start hanger, get some
      pings done
               System.out.println("exec'ing...");
                //2 hours should be enough ;)
      // Runtime.getRuntime().exec("java child 120"); //generic
               Runtime.getRuntime().exec("cmd /c start java child 120"); //NT so
      we get a window
               System.out.println("naping...");
               Thread.sleep(20*1000); //some platforms (solaris) need this for
      the exec to work
               System.out.println("done.");
               if (exit) {
                  Thread.sleep(120*1000); //2 minutes to verify hanger still up
      and good
                  System.out.println("exiting");
                  System.exit(0);
               }
            } catch (Exception e) {
               e.printStackTrace();
            }
         }
      }

      ====================================================
      child.java
      ====================================================
      public class child {
         public static void main(String[] args) {
            try {
               Thread.sleep(Integer.parseInt(args[0])*60*1000);
            } catch (InterruptedException ie) {
               ie.printStackTrace();
            }
         }
      }

      ====================================================
      hanger.java
      ====================================================
      public class hanger {
         public static void main (String[] args) {
            try {
               RMIWidgetInterface target = (RMIWidgetInterface)
      java.rmi.Naming.lookup("RMIWidget-0");
               while (true) {
                  System.out.println("Ping begining");
                  target.pingMe();
                  System.out.println("Ping completed");
                  for (int lcv = 15; lcv > 0; --lcv) {
                     Thread.sleep(1000);
                     System.out.print((lcv%5==0)?('|'):('-'));
                  }
                  System.out.println('.');
               }
            } catch (Exception e) {
               e.printStackTrace();
            }
         }
      }

      ---------------------------------------------

      Here is a thread dump from 1.2.2 V build hung process:

      Full thread dump Classic VM (JDK-1.2-V, native threads):
          "GC Daemon" (TID:0x18f3230, sys_thread_t:0x859ec0, state:CW, native ID:0xd6) prio=2
              at java.lang.Object.wait(Native Method)
              at sun.misc.GC$Daemon.run(Compiled Code)
          "RMI RenewClean-[129.144.171.109:1102]" (TID:0x18f2fe8, sys_thread_t:0x857ae0, state:R, native ID:0xb8) prio=5
              at java.net.SocketInputStream.socketRead(Native Method)
              at java.net.SocketInputStream.read(Compiled Code)
              at java.io.BufferedInputStream.fill(Compiled Code)
              at java.io.BufferedInputStream.read(Compiled Code)
              at java.io.DataInputStream.readByte(Compiled Code)
              at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:210)
              at sun.rmi.transport.tcp.TCPChannel.newConnection(Compiled Code)
              at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:306)
              at sun.rmi.transport.DGCImpl_Stub.dirty(Unknown Source)
              at sun.rmi.transport.DGCClient$EndpointEntry.makeDirtyCall(Compiled Code)
              at sun.rmi.transport.DGCClient$EndpointEntry$RenewCleanThread.run(Compiled Code)
              at java.lang.Thread.run(Thread.java:479)
          "Finalizer" (TID:0x18e9320, sys_thread_t:0x81b460, state:CW, native ID:0xc4) prio=8
              at java.lang.Object.wait(Native Method)
              at java.lang.ref.ReferenceQueue.remove(Compiled Code)
              at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:127)
              at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:174)
          "Reference Handler" (TID:0x18e93b0, sys_thread_t:0x819390, state:CW, native
      ID:0xc2) prio=10
              at java.lang.Object.wait(Native Method)
              at java.lang.Object.wait(Object.java:424)
              at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:114)
          "Signal dispatcher" (TID:0x18e93e0, sys_thread_t:0x819770, state:R, native ID:0xbf) prio=5
          "main" (TID:0x18e91e0, sys_thread_t:0x7d0c00, state:R, native ID:0xcc) prio=5
              at java.net.SocketInputStream.socketRead(Native Method)
              at java.net.SocketInputStream.read(Compiled Code)
              at java.io.BufferedInputStream.fill(Compiled Code)
              at java.io.BufferedInputStream.read(Compiled Code)
              at sun.rmi.transport.tcp.TCPConnection.isDead(TCPConnection.java:173)
              at sun.rmi.transport.tcp.TCPChannel.newConnection(Compiled Code)
              at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:306)
              at RMIWidget_Stub.pingMe(RMIWidget_Stub.java:29)
              at hanger.main(Compiled Code)
      Monitor Cache Dump:
          java.lang.ref.ReferenceQueue$Lock@18E9338/191ED60: <unowned>
              Waiting to be notified:
                  "Finalizer" (0x81b460)
          java.lang.ref.Reference$Lock@18E93C0/191E890: <unowned>
              Waiting to be notified:
                  "Reference Handler" (0x819390)
          sun.misc.GC$LatencyLock@18F2F90/1950550: <unowned>
              Waiting to be notified:
                  "GC Daemon" (0x859ec0)
          java.io.BufferedInputStream@18F3578/1951430: owner "main" (0x7d0c00) 1 entry
          java.io.BufferedInputStream@18F5378/19540D0: owner "RMI RenewClean-[129.144.171.109:1102]" (0x857ae0) 1 entry
      Registered Monitor Dump:
          SymcJIT Method Monitor: <unowned>
          SymcJIT Method Monitor: <unowned>
          SymcJIT Method List Monitor: <unowned>
          SymcJIT Lock: <unowned>
          utf8 hash table: <unowned>
          JNI pinning lock: <unowned>
          JNI global reference lock: <unowned>
          BinClass lock: <unowned>
          Class linking lock: <unowned>
          System class loader lock: <unowned>
          Code rewrite lock: <unowned>
          Heap lock: <unowned>
          Monitor cache lock: owner "Signal dispatcher" (0x819770) 1 entry
          Thread queue lock: owner "Signal dispatcher" (0x819770) 1 entry
          Monitor registry: owner "Signal dispatcher" (0x819770) 1 entry


          SymcJIT Method List Monitor: <unowned>
          SymcJIT Lock: <unowned>
          utf8 hash table: <unowned>
          JNI pinning lock: <unowned>
          JNI global reference lock: <unowned>
          BinClass lock: <unowned>
          Class linking lock: <unowned>
          System class loader lock: <unowned>
          Code rewrite lock: <unowned>
          Heap lock: <unowned>
          Monitor cache lock: owner "Signal dispatcher" (0x819770) 1 entry
          Thread queue lock: owner "Signal dispatcher" (0x819770) 1 entry
          Monitor registry: owner "Signal dispatcher" (0x819770) 1 entry



            pongsunw Patrick Ong (Inactive)
            miflemi Mick Fleming
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

              Created:
              Updated:
              Resolved:
              Imported:
              Indexed: