CODETOOLS-7903580

Allow for re-attempting agent creation when an attempt fails

    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • jtreg7.4
    • None
    • tools
    • None

      When jtreg runs an action for a test (for example, the compile, main, testng, or junit action) in agent VM mode, it queries an internal agent pool. The pool implementation is responsible for either returning an eligible Agent instance or creating a new Agent instance and returning that one.
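
      The get-or-create behaviour described above can be sketched roughly as follows. This is only an illustration, not jtreg's actual pool: the class names SketchAgent and SketchAgentPool, the string key used to decide eligibility, and the release method are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for an agent; not jtreg's real Agent class. */
final class SketchAgent {
    final String key; // assumed eligibility criterion, e.g. a JDK/VM-options identifier
    SketchAgent(String key) { this.key = key; }
}

/** Hypothetical pool: returns an idle agent matching the key, or creates a new one. */
final class SketchAgentPool {
    private final Map<String, Deque<SketchAgent>> idle = new HashMap<>();
    private final Function<String, SketchAgent> factory;
    int created = 0; // number of agents the pool had to create

    SketchAgentPool(Function<String, SketchAgent> factory) {
        this.factory = factory;
    }

    /** Reuse an eligible idle agent if there is one; otherwise create a fresh one. */
    synchronized SketchAgent acquire(String key) {
        Deque<SketchAgent> candidates = idle.get(key);
        if (candidates != null && !candidates.isEmpty()) {
            return candidates.pop();
        }
        created++;
        return factory.apply(key); // this is the step that can fail in the real tool
    }

    /** Hand an agent back so a later action with the same key can reuse it. */
    synchronized void release(SketchAgent agent) {
        idle.computeIfAbsent(agent.key, k -> new ArrayDeque<>()).push(agent);
    }
}
```

      In this sketch the creation step is the only place where a failure can originate, which is why a failure there reaches whichever action asked the pool for an agent.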

      Each Agent consists of two components. The first is the com.sun.javatest.regtest.exec.Agent instance, which creates a java.net.ServerSocket on an ephemeral port and waits for a configurable duration to accept a connection on that port. The second is the com.sun.javatest.regtest.agent.AgentServer. During construction, each com.sun.javatest.regtest.exec.Agent launches a separate process whose main class is com.sun.javatest.regtest.agent.AgentServer. One of the arguments passed to this main class is the port on which the ServerSocket is waiting for a connection. The com.sun.javatest.regtest.agent.AgentServer then establishes a java.net.Socket connection to this port. Once the connection is established, the com.sun.javatest.regtest.exec.Agent instance has been created successfully and is stored in the agent pool.
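
      A minimal sketch of this handshake, using only java.net, might look like the following. It is not jtreg's code: the class and method names are hypothetical, and a daemon thread stands in for the separately launched AgentServer process so that the example stays self-contained.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class HandshakeSketch {
    /**
     * Listens on an ephemeral loopback port, starts the stand-in "agent server",
     * and waits up to acceptTimeoutMillis for it to connect back.
     */
    static Socket awaitAgentServer(int acceptTimeoutMillis) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) { // port 0 = ephemeral
            listener.setSoTimeout(acceptTimeoutMillis); // accept() throws SocketTimeoutException after this
            int port = listener.getLocalPort();
            launchStandInAgentServer(loopback, port); // the real tool launches a process and passes the port as an argument
            return listener.accept();
        }
    }

    /** Assumption for the sketch: a thread plays the role of the launched process. */
    static void launchStandInAgentServer(InetAddress host, int port) {
        Thread t = new Thread(() -> {
            try (Socket s = new Socket(host, port)) {
                s.getOutputStream().write(1); // one byte so the listening side can observe the connection
            } catch (IOException e) {
                // If connecting fails, the listening side eventually sees an accept timeout.
            }
        });
        t.setDaemon(true);
        t.start();
    }

    public static void main(String[] args) throws IOException {
        try (Socket s = awaitAgentServer(5_000)) {
            System.out.println("connected, first byte = " + s.getInputStream().read());
        }
    }
}
```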

      For various reasons, the connection between the com.sun.javatest.regtest.exec.Agent and the com.sun.javatest.regtest.agent.AgentServer in the separately launched process may not be established within the accept timeout. In that case, creation of the com.sun.javatest.regtest.exec.Agent instance fails with an exception like "java.net.SocketTimeoutException: Accept timed out".
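
      The failure mode can be reproduced in isolation: if nothing connects before the timeout set on the ServerSocket elapses, accept() throws java.net.SocketTimeoutException. The class and method names below are hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;

final class AcceptTimeoutSketch {
    /** Returns true if accept() timed out because no peer connected within timeoutMillis. */
    static boolean timesOut(int timeoutMillis) throws IOException {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            listener.setSoTimeout(timeoutMillis);
            try {
                listener.accept().close(); // unexpected here: nothing is supposed to connect
                return false;
            } catch (SocketTimeoutException e) {
                return true; // the situation described in this issue
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + timesOut(200));
    }
}
```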

      The failure to create an instance through the pool propagates all the way up to the execution of the test action, and the action itself fails with an exception like "Cannot get VM for test: java.net.SocketTimeoutException: Accept timed out".

      It has been noticed that the inability to establish such a socket connection is intermittent, i.e. a subsequent attempt to create a new Agent instance (on a new port) can succeed, and most of the time does. This is an enhancement request to introduce a way for jtreg to re-attempt agent creation when the first attempt fails, preferably in a configurable way that also allows the feature to be disabled.
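
      One possible shape for such a re-attempt is sketched below. It is not the change that was made in jtreg: the class name RetrySketch, the method createWithRetry, and the maxAttempts parameter are assumptions for illustration. Setting maxAttempts to 1 corresponds to disabling the feature.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class RetrySketch {
    /**
     * Calls the factory up to maxAttempts times and returns the first successful result.
     * Only IOException (which includes SocketTimeoutException) triggers another attempt;
     * any other exception propagates immediately.
     */
    static <T> T createWithRetry(Callable<T> factory, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException failure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return factory.call(); // each call is expected to open a fresh port
            } catch (IOException e) {
                if (failure == null) {
                    failure = e;             // report the first failure...
                } else {
                    failure.addSuppressed(e); // ...and keep later ones attached to it
                }
            }
        }
        throw failure; // non-null here: every attempt failed with an IOException
    }
}
```

      A caller could wrap the agent construction step in such a helper, for example createWithRetry(() -> newAgent(), 2), where newAgent is again a hypothetical name for whatever creates the agent.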

            jpai Jaikiran Pai