JDK-4168586

Regression: TCP broken on SOLARIS in JDK1.2FCS-G build. Performance and hangs.

    • 1.2fcs
    • sparc
    • solaris_2.5.1, solaris_2.6
    • Not verified

      I am seeing the following 3 problems with TCP on SOLARIS 2.6 using
      JDK1.2FCS-G when running a TCP server from SOLARIS 2.6 and running
      TCP clients from either SOLARIS 2.6 or WIN32.

      These problems are SOLARIS specific. They occur only on SOLARIS 2.6,
      both with and without the JIT, and seem to manifest themselves only
      with GREEN threads.

      More info on these bugs is in the attachment section: the file
      tcpbug-moreinfo contains additional stack traces of these problems.

      (1) Problem #1 is that when you start up a tcpServer on SOLARIS 2.6, it
          blocks in the call to the ServerSocket() constructor for about
          20-30 seconds. This should be instantaneous. This is a serious
          performance problem.

          This does not happen on the JDK1.2FCS-F build.

      (2) Problem #2 is that a tcpClient running on SOLARIS 2.6 takes a long
          time to connect to the tcpServer: it blocks in the call to the
          Socket() constructor for about 20-30 seconds. Connection is
          instantaneous on WIN32, so something changed on SOLARIS 2.6 to make
          connecting slow. This is another serious performance problem.

          This does not happen on the JDK1.2FCS-F build.

      (3) Problem #3 is that I can consistently get the tcpServer running on
          SOLARIS 2.6 to hang when running a number of clients against it.
          This is very bad.

          This does not happen on the JDK1.2FCS-F build.

      ----------
      In SUMMARY
      ----------
      The constructor to create a ServerSocket() hangs for about 20-30 seconds
              on SOLARIS 2.6.

      The constructor to create a Socket() and connect hangs for about 20-30
              seconds on SOLARIS 2.6.

      The tcpServer hangs when running a number of tcpClients.
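The two constructor delays summarized above can be measured in isolation. The following is a minimal sketch only: the class `ConstructorTiming` and its methods are hypothetical and are not part of the attached test code, and it uses current Java syntax that would need adapting to build on JDK 1.2. Run it in a fresh JVM, since a one-time initialization cost would show up only on the first socket created.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical timing helper; not part of the attached tcp.tar test code. */
public class ConstructorTiming {

    /** Milliseconds spent inside the ServerSocket constructor (port 0 = any free port). */
    static long timeServerSocketMillis(int port) {
        long start = System.nanoTime();
        try (ServerSocket server = new ServerSocket(port)) {
            return (System.nanoTime() - start) / 1_000_000;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Milliseconds spent inside the Socket constructor, which also connects. */
    static long timeLoopbackConnectMillis() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // The listening socket is opened first so that only the client side is timed.
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            long start = System.nanoTime();
            try (Socket client = new Socket(loopback, server.getLocalPort())) {
                return (System.nanoTime() - start) / 1_000_000;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ServerSocket(): " + timeServerSocketMillis(0) + " ms");
        System.out.println("Socket():       " + timeLoopbackConnectMillis() + " ms");
    }
}
```

A healthy build should report a few milliseconds at most for each line; a value in the tens of seconds would match problems #1 and #2.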

      Below is a stack trace of the hung tcpServer. From the stack trace it
      looks like the problem is in the SocketRead() native code.

      The attachment includes the TCP Server and TCP Client test code.
      Look at the README file for more info. To extract and build, type:

      % tar xvf tcp.tar
      % cd tcp
      % make

      To reproduce the above problems and the hang, type:

      Start the TCP Server in one window:

      server_machine> java tcpServer -d -v -l 0

      Start the TCP Clients in the other windows:

      client1_machine> run_tcp_client server_machine
      client2_machine> run_tcp_client server_machine
      client3_machine> run_tcp_client server_machine


      You need to start several TCP clients to reproduce the hang. You can start
      TCP clients from SOLARIS and WIN32.
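For readers without access to the attachment, the shape of the test (one server thread per connected client, several clients sending numbered messages) can be approximated in a single process. This is a hypothetical stand-in, not the attached tcpServer/tcpClient code: the class `LoopbackRepro`, its message format (plain ints), and the 5 second accept timeout are all assumptions made for illustration, and it is written for a current JDK.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical single-process stand-in for the attached tcpServer/tcpClient pair. */
public class LoopbackRepro {

    /** Starts `clients` senders of `messages` ints each; returns how many ints the server read. */
    static int run(int clients, int messages) {
        AtomicInteger received = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            server.setSoTimeout(5000); // assumption: fail instead of waiting forever in accept()
            int port = server.getLocalPort();
            for (int c = 0; c < clients; c++) {
                Thread t = new Thread(() -> sendMessages(port, messages));
                threads.add(t);
                t.start();
            }
            for (int c = 0; c < clients; c++) {
                Socket accepted = server.accept();
                // One reader thread per client connection.
                Thread t = new Thread(() -> readMessages(accepted, received));
                threads.add(t);
                t.start();
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received.get();
    }

    /** Client side: connect, write the numbered messages, close. */
    static void sendMessages(int port, int messages) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
            for (int i = 1; i <= messages; i++) {
                out.writeInt(i);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Server side: blocking reads until the client closes the connection. */
    static void readMessages(Socket socket, AtomicInteger received) {
        try (Socket s = socket;
             DataInputStream in = new DataInputStream(s.getInputStream())) {
            while (true) {
                in.readInt();
                received.incrementAndGet();
            }
        } catch (EOFException e) {
            // Normal end of stream: the client closed its socket.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("messages read: " + run(3, 30));
    }
}
```

If `run` never returns, the reader threads are stuck in a blocking read, which is the symptom described for the hang.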

      DEBUG: read message #9 from aefpc/129.148.27.249:4167
      VERBOSE: Connection to client 129.148.27.249:4167 closed.
      DEBUG: read message #5 from highlanders/129.148.27.231:55430
      DEBUG: read message #1 from localhost/127.0.0.1:35056
      DEBUG: read message #2 from localhost/127.0.0.1:35056
      DEBUG: SOCKET INFO
      DEBUG: -----------
      DEBUG: getLocalPort() = 25000
      DEBUG: getPort() = 4175
      DEBUG: getSoLinger() = -1
      DEBUG: getSoTimeout() = 0
      DEBUG: getTcpNoDelay() = false
      ^\SIGQUIT

      Full thread dump Classic VM (JDK-1.2fcs-G, green threads):
          "Thread-53" (TID:0xebcaaac0, sys_thread_t:0x1a29e8, state:R) prio=5
              at java.net.SocketInputStream.socketRead(Native Method)
              at java.net.SocketInputStream.read(Compiled Code)
              at tcpServer$ProcessClient.run(Compiled Code)
          "Thread-52" (TID:0xebcac9a0, sys_thread_t:0x1a1fa8, state:R) prio=5
              at java.net.SocketInputStream.socketRead(Native Method)
              at java.net.SocketInputStream.read(Compiled Code)
              at tcpServer$ProcessClient.run(Compiled Code)
          "Thread-51" (TID:0xebcadf18, sys_thread_t:0x1a2208, state:R) prio=5
              at java.net.SocketInputStream.socketRead(Native Method)
              at java.net.SocketInputStream.read(Compiled Code)
              at tcpServer$ProcessClient.run(Compiled Code)
          "Thread-1" (TID:0xebca2770, sys_thread_t:0x191518, state:CW) prio=5
              at java.lang.Object.wait(Native Method)
              at java.lang.Object.wait(Compiled Code)
              at tcpServer$ClientManager.run(Compiled Code)
          "Thread-0" (TID:0xebca1140, sys_thread_t:0x192f18, state:R) prio=5 *current thread*
              at java.net.InetAddressImpl.getHostByAddr(Native Method)
              at java.net.InetAddress.getHostName(Compiled Code)
              at java.net.InetAddress.getHostName(Compiled Code)
              at java.net.InetAddress.toString(Compiled Code)
              at java.lang.String.valueOf(Compiled Code)
              at java.lang.StringBuffer.append(Compiled Code)
              at tcpServer$Listener.DumpSocketInfo(Compiled Code)
              at tcpServer$Listener.run(Compiled Code)
          "Finalizer" (TID:0xebc98390, sys_thread_t:0x669a0, state:CW) prio=8
              at java.lang.Object.wait(Native Method)
              at java.lang.ref.ReferenceQueue.remove(Compiled Code)
              at java.lang.ref.ReferenceQueue.remove(Compiled Code)
              at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:174)
          "Reference Handler" (TID:0xebc98420, sys_thread_t:0x640f8, state:CW) prio=10
              at java.lang.Object.wait(Native Method)
              at java.lang.Object.wait(Compiled Code)
              at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:209)
          "Signal dispatcher" (TID:0xebc98190, sys_thread_t:0x36800, state:R) prio=5
          "main" (TID:0xebc98250, sys_thread_t:0x28928, state:CW) prio=5
              at java.lang.Object.wait(Native Method)
              at java.lang.Thread.join(Compiled Code)
              at java.lang.Thread.join(Compiled Code)
              at tcpServer.<init>(Compiled Code)
              at tcpServer.main(Compiled Code)
      Monitor Cache Dump:
          java.lang.ref.Reference$Lock@EBC98430/EBCCD688: <unowned>
              Waiting to be notified:
                  "Reference Handler" (0x640f8)
          java.lang.StringBuffer@EBCA9710/EBD833A0: owner "Thread-0" (0x192f18) 1 entry
          tcpServer$ClientManager@EBCA2770/EBD15B18: <unowned>
              Waiting to be notified:
                  "Thread-1" (0x191518)
          tcpServer$Listener@EBCA1140/EBD103B8: <unowned>
              Waiting to be notified:
                  "main" (0x28928)
          java.lang.ref.ReferenceQueue$Lock@EBC983A8/EBCCDB90: <unowned>
              Waiting to be notified:
                  "Finalizer" (0x669a0)
      Registered Monitor Dump:
          PCMap lock: <unowned>
          utf8 hash table: <unowned>
          JNI pinning lock: <unowned>
          JNI global reference lock: <unowned>
          BinClass lock: <unowned>
          Class linking lock: <unowned>
          System class loader lock: <unowned>
          Code rewrite lock: <unowned>
          Heap lock: <unowned>
          Monitor cache lock: owner "Signal dispatcher" (0x36800) 1 entry
          Dynamic loading lock: <unowned>
          Monitor IO lock: <unowned>
          User signal monitor: <unowned>
          Child death monitor: <unowned>
          I/O monitor: <unowned>
          Alarm monitor: <unowned>
              Waiting to be notified:
                  <unknown thread> (0x2b7f8)
          Thread queue lock: owner "Signal dispatcher" (0x36800) 1 entry
          Monitor registry: owner "Signal dispatcher" (0x36800) 1 entry


      It seems like I am seeing hangs when running some URL tests as well. Again the
      culprit looks like SocketRead().

      We are also getting failures from RMI due to these TCP problems. Here are some
      stack traces of the RMI problems.
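The socket info printed before the dump shows getSoTimeout() = 0, which means reads block with no time limit. One way to make a stalled read visible in a test, instead of hanging it, is to set a read timeout. This is a sketch under assumptions: `ReadTimeoutProbe` is a hypothetical class, it targets a current JDK (`SocketTimeoutException` is not available in JDK 1.2), and it is a diagnostic aid, not a fix for the underlying problem.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical probe: bounds a blocking socket read with SO_TIMEOUT. */
public class ReadTimeoutProbe {

    /**
     * Opens an idle loopback connection and reads from it.
     * Returns true if the read gave up after timeoutMillis (must be > 0).
     */
    static boolean idleReadTimesOut(int timeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 50, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // A value of 0 would restore the default: block indefinitely.
            accepted.setSoTimeout(timeoutMillis);
            try {
                accepted.getInputStream().read(); // the client never writes
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + idleReadTimesOut(200));
    }
}
```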

      MyObject's not garbage collected yet on DGCTestServer = 819

      Creating 300 MyObject's of size 2K one at a time on DGCTestServer
      ERROR: DGCTestClient.doPhase1Tests(): exception occurredConnection refused to host: 129.148.27.228; nested exception is:
              java.net.ConnectException: Connection refused
      java.rmi.ConnectException: Connection refused to host: 129.148.27.228; nested exception is:
              java.net.ConnectException: Connection refused
      java.net.ConnectException: Connection refused
              at java.net.PlainSocketImpl.socketConnect(Native Method)
              at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:305)
              at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:125)
              at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:112)
              at java.net.Socket.<init>(Socket.java:226)
              at java.net.Socket.<init>(Socket.java:95)
              at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:29)
              at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:124)
              at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:462)
              at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:194)
              at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:178)
              at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:87)
              at dgc.MyObjectFactoryImpl_Stub.createMyObject(MyObjectFactoryImpl_Stub.java)
              at dgc.DGCTestClient.doPhase1Tests(DGCTestClient.java:731)
              at dgc.DGCTestClient.run(DGCTestClient.java:272)
              at java.lang.Thread.run(Thread.java:475)
      ERROR: DGCTestClient.doPhase1Tests(): exception occurredConnection refused to host: 129.148.27.228; nested exception is:
              java.net.ConnectException: Connection refused
      java.rmi.ConnectException: Connection refused to host: 129.148.27.228; nested exception is:
              java.net.ConnectException: Connection refused
      java.net.ConnectException: Connection refused
              at java.net.PlainSocketImpl.socketConnect(Native Method)
              at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:305)
              at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:125)
              at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:112)
              at java.net.Socket.<init>(Socket.java:226)
              at java.net.Socket.<init>(Socket.java:95)
              at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:29)
              at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:124)
              at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:462)
              at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:194)
              at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:178)
              at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:87)
              at dgc.MyObjectFactoryImpl_Stub.createMyObject(MyObjectFactoryImpl_Stub.java)
              at dgc.DGCTestClient.doPhase1Tests(DGCTestClient.java:731)
              at dgc.DGCTestClient.run(DGCTestClient.java:272)
              at java.lang.Thread.run(Thread.java:475)

      Below is the stack trace where it hangs for about 20-30 seconds after calling
      the ServerSocket() constructor from the tcpServer program.

      lobo 41 =>java tcpServer -d -v -p 30000 -l 0

      VERBOSE: port number is = 30000
      VERBOSE: loop count is = 0
      VERBOSE: allocating and starting up the Listener
      VERBOSE: allocating ServerSocket on port = 30000
      ^\SIGQUIT

      Full thread dump Classic VM (JDK-1.2fcs-G, green threads):
          "Finalizer" (TID:0xebc98390, sys_thread_t:0x669a0, state:CW) prio=8
              at java.lang.Object.wait(Native Method)
              at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:113)
              at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:128)
              at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:174)
          "Reference Handler" (TID:0xebc98420, sys_thread_t:0x640f8, state:CW) prio=10
              at java.lang.Object.wait(Native Method)
              at java.lang.Object.wait(Object.java:303)
              at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:209)
          "Signal dispatcher" (TID:0xebc98190, sys_thread_t:0x36800, state:R) prio=5
          "main" (TID:0xebc98250, sys_thread_t:0x28928, state:CW) prio=5
              at java.net.PlainSocketImpl.initProto(Native Method)
              at java.net.PlainSocketImpl.<clinit>(PlainSocketImpl.java:61)
              at java.net.ServerSocket.<init>(Compiled Code)
              at java.net.ServerSocket.<init>(Compiled Code)
              at java.net.ServerSocket.<init>(Compiled Code)
              at tcpServer$Listener.<init>(Compiled Code)
              at tcpServer.<init>(Compiled Code)
              at tcpServer.main(Compiled Code)
      Monitor Cache Dump:
          java.lang.ref.Reference$Lock@EBC98430/EBCCD688: <unowned>
              Waiting to be notified:
                  "Reference Handler" (0x640f8)
          java.lang.ref.ReferenceQueue$Lock@EBC983A8/EBCCDB90: <unowned>
              Waiting to be notified:
                  "Finalizer" (0x669a0)
      Registered Monitor Dump:
          PCMap lock: <unowned>
          utf8 hash table: <unowned>
          JNI pinning lock: <unowned>
          JNI global reference lock: <unowned>
          BinClass lock: <unowned>
          Class linking lock: <unowned>
          System class loader lock: <unowned>
          Code rewrite lock: <unowned>
          Heap lock: <unowned>
          Monitor cache lock: owner "Signal dispatcher" (0x36800) 1 entry
          Dynamic loading lock: <unowned>
          Monitor IO lock: <unowned>
          User signal monitor: <unowned>
          Child death monitor: <unowned>
          I/O monitor: <unowned>
              Waiting to be notified:
                  "main" (0x28928)
          Alarm monitor: <unowned>
              Waiting to be notified:
                  <unknown thread> (0x2b7f8)
          Thread queue lock: owner "Signal dispatcher" (0x36800) 1 entry
          Monitor registry: owner "Signal dispatcher" (0x36800) 1 entry

      ^C

      Below is the stack trace where it hangs for about 20-30 seconds after calling
      the Socket() constructor to connect to the tcpServer program from the tcpClient
      program.

      lobo 42 =>java tcpClient -d -v -c -m 30 -l 2 -b 16

      VERBOSE: port number is = 25000
      VERBOSE: buffer size is = 16K
      VERBOSE: number of messages = 30
      VERBOSE: server name = localhost
      VERBOSE: data comparison = true
      VERBOSE: random buffer sizes = false
      VERBOSE: loop count is = 2
      DTI_DoneInitializing
      VERBOSE: connect to server localhost at port 25000
      ^\SIGQUIT

      Full thread dump Classic VM (JDK-1.2fcs-G, green threads):
          "Finalizer" (TID:0xebc98390, sys_thread_t:0x669a0, state:CW) prio=8
              at java.lang.Object.wait(Native Method)
              at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:113)
              at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:128)
              at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:174)
          "Reference Handler" (TID:0xebc98420, sys_thread_t:0x640f8, state:CW) prio=10
              at java.lang.Object.wait(Native Method)
              at java.lang.Object.wait(Object.java:303)
              at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:209)
          "Signal dispatcher" (TID:0xebc98190, sys_thread_t:0x36800, state:R) prio=5
          "main" (TID:0xebc98250, sys_thread_t:0x28928, state:CW) prio=5
              at java.net.PlainSocketImpl.initProto(Native Method)
              at java.net.PlainSocketImpl.<clinit>(PlainSocketImpl.java:61)
              at java.net.Socket.<init>(Compiled Code)
              at java.net.Socket.<init>(Compiled Code)
              at java.net.Socket.<init>(Compiled Code)
              at tcpClient.<init>(Compiled Code)
              at tcpClient.main(Compiled Code)
      Monitor Cache Dump:
          java.lang.ref.Reference$Lock@EBC98430/EBCCD688: <unowned>
              Waiting to be notified:
                  "Reference Handler" (0x640f8)
          ja

            never Tom Rodriguez
            aefreche Alan Frechette (Inactive)