JDK-5086160

Request for improvements to javax.naming.directory


    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • Affects Version/s: 6
    • Fix Version/s: 6
    • Component/s: core-libs
    • Labels: None
    • Resolved In Build: b40
    • CPU: sparc
    • OS: solaris_9

        Jacobs Rimell is providing a solution for Comcast.
        They have run into some issues that they want us to be aware of, and have
        suggested changes that they feel would improve the interface.

        Background Info
        ================

        Jacobs Rimell (JR) have a product called APS. The current version (APS4) is
        a J2EE application that requires LDAP access to multiple LDAP directories
        from within an app-server.

        At Comcast there is only one directory (the GDS); however, some of this email
        reflects APS requirements as well as Comcast's requirements on APS. Given
        that Comcast contacted you originally, I have deliberately indicated where
        the issue/current implementation has a direct impact on the customer.

        JR Issues with LDAP connection pooling
        ======================================

        Originally we tried to use the inbuilt connection pooling in recent releases
        of Sun's LDAP provider. This had the advantage of near-transparency.

        First problem (JR only): configuration of pool sizes etc. is on a per-VM
        basis. The only choice available when making a connection to a specific
        directory is whether or not to pool connections. Our product may need to
        work with several directories with radically different characteristics,
        but to expedite our product roadmap we decided to proceed with this
        limitation until we had a direct (new customer) requirement.
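
The split described above can be sketched as follows. This is a minimal illustration, not JR's code: the class name, the sizes and the timeout value are assumptions chosen for the example. Pool sizing goes through VM-wide system properties (typically supplied with -D at startup), while the per-context environment only carries the on/off switch.

```java
import java.util.Hashtable;
import javax.naming.Context;

// Hypothetical helper; names and values are illustrative only.
final class PoolingEnvSketch {

    // Pool sizing is read from system properties, so it applies to every
    // pooled LDAP connection in the VM, whichever directory it targets.
    static void configureVmWidePool() {
        System.setProperty("com.sun.jndi.ldap.connect.pool.maxsize", "20");
        System.setProperty("com.sun.jndi.ldap.connect.pool.prefsize", "10");
        System.setProperty("com.sun.jndi.ldap.connect.pool.timeout", "300000"); // idle ms
    }

    // Per directory, the only pooling decision is whether to pool at all.
    static Hashtable<String, Object> envFor(String url, boolean pooled) {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, url);
        env.put("com.sun.jndi.ldap.connect.pool", Boolean.toString(pooled));
        return env;
    }
}
```

The resulting table would be passed to `new InitialDirContext(env)`; two directories with very different load profiles would still share the same size limits.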

        Second problem (JR / Comcast): We implemented a simple round-robin
        load-balancing algorithm against multiple eTrust DSAs by leveraging the
        capability of Sun's LDAP SP to accept a list of URLs in its
        Context.PROVIDER_URL parameter. For each new InitialDirContext request we
        rotate the URLs so that each request goes to a different DSA. We believe
        that the connection pooling creates a pool based on the host/port of the
        URL, which means that with, for example, 3 URLs ("ldap://a:389
        ldap://a:10389 ldap://a:20389") we would end up with 3 connection pools.
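
The URL rotation described above could look roughly like this. It is a sketch under assumptions: the class and method names are hypothetical, and only the space-separated PROVIDER_URL format comes from the text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper that yields a differently ordered URL list per request.
final class ProviderUrlRotator {
    private final List<String> urls;
    private final AtomicInteger next = new AtomicInteger();

    ProviderUrlRotator(List<String> urls) {
        if (urls.isEmpty()) {
            throw new IllegalArgumentException("at least one URL is required");
        }
        this.urls = new ArrayList<>(urls);
    }

    // Returns a space-separated URL list whose first entry advances on each call.
    String nextProviderUrl() {
        int start = Math.floorMod(next.getAndIncrement(), urls.size());
        List<String> rotated = new ArrayList<>(urls);
        Collections.rotate(rotated, -start); // negative distance shifts entries left
        return String.join(" ", rotated);
    }
}
```

Each call's result would be stored under Context.PROVIDER_URL before creating the next InitialDirContext, so the first URL tried differs from request to request.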

        Third problem (JR / Comcast): Lost TCP packets on a customer network. The
        root cause was a misconfigured VPN, which prevented search results returning
        to the LDAP client. The effect was that the thread executing the
        LdapRequest in the application server hung waiting for a response until the
        TCP socket timed out after 2 hours. This quickly led to all execute threads
        being stuck. We reproduced this in our lab in London by heavily exercising
        a DSA (outside of APS) until its response to APS mimicked the misconfigured
        VPN. While these two scenarios are extreme, APS is required to be always
        available and hence must survive operational issues.

        Setting search timeouts using SearchControls didn't help as this mechanism
        relies on the directory server aborting the search request if it takes too
        long. Our problem was not the duration of the search itself but the
        non-arrival of the results.
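
For reference, the search time limit mentioned above is set as in the following sketch (the class name and the choice of subtree scope are assumptions). As the text explains, the limit only bounds how long the search itself may run; it does nothing for a client that is blocked waiting for a response that never arrives.

```java
import javax.naming.directory.SearchControls;

// Hypothetical helper showing where the search time limit is configured.
final class SearchTimeLimitSketch {

    static SearchControls subtreeSearchWithLimit(int timeLimitMillis) {
        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        // Time limit in milliseconds; 0 means wait indefinitely.
        controls.setTimeLimit(timeLimitMillis);
        return controls;
    }
}
```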

        LDAP socket factories
        =====================

        To avoid the stuck execute threads we decided we had to implement a timeout
        on LDAP sockets. The supported mechanism for this is to install a
        SocketFactory for LDAP requests, and within that SocketFactory to set the
        socket's SO_TIMEOUT. The name of the socket factory's class is used as a
        parameter during connection to the directory.
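
A socket factory of the kind described above might be sketched as follows. The class name and the 30-second value are assumptions for illustration, not JR's implementation. The timeout is hard-coded because the factory is identified only by class name.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import javax.net.SocketFactory;

// Hypothetical factory. NOTE: declare it public in its own file before
// naming it in the environment, since the provider loads it reflectively.
class TimeoutSocketFactory extends SocketFactory {
    static final int READ_TIMEOUT_MILLIS = 30_000; // illustrative value

    // Factories named in the JNDI environment are obtained through this method.
    public static SocketFactory getDefault() {
        return new TimeoutSocketFactory();
    }

    private static Socket withTimeout(Socket socket) throws IOException {
        socket.setSoTimeout(READ_TIMEOUT_MILLIS); // blocking reads fail after this long
        return socket;
    }

    @Override
    public Socket createSocket() throws IOException {
        return withTimeout(new Socket()); // unconnected; the caller connects it
    }

    @Override
    public Socket createSocket(String host, int port) throws IOException {
        return withTimeout(new Socket(host, port));
    }

    @Override
    public Socket createSocket(String host, int port, InetAddress localHost, int localPort)
            throws IOException {
        return withTimeout(new Socket(host, port, localHost, localPort));
    }

    @Override
    public Socket createSocket(InetAddress host, int port) throws IOException {
        return withTimeout(new Socket(host, port));
    }

    @Override
    public Socket createSocket(InetAddress address, int port, InetAddress localAddress, int localPort)
            throws IOException {
        return withTimeout(new Socket(address, port, localAddress, localPort));
    }
}
```

The factory would be selected by putting its class name into the context environment under "java.naming.ldap.factory.socket".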

        This worked inasmuch as it caused the socket to fail when it waited too long
        for a response -- which we could then detect and apply a suitable retry
        strategy to -- but had other problems.

        Firstly (JR only): there is no easy way to configure different timeouts for
        different directories. Because the only parameter available is the name of
        the socket factory, it is necessary to implement a distinct physical class
        for each directory. Within the factory it seems that only the
        createSocket() method is used -- so we cannot even use the destination
        host/port to decide on the timeout to use.

        Secondly (JR / Comcast): the use of a custom LDAP socket factory explicitly
        disables Sun's connection pooling!

        Thirdly (JR / Comcast): the timeout appears to cause the socket to fail when
        idle. We assume that the service provider is listening for unsolicited
        notifications and timing out. Once a socket fails the connection is no
        longer usable.

        Our Current solution
        ====================

        We have stopped using the connection pooling in Sun's LDAP service provider
        and implemented our own context pool instead. We are still using the socket
        timeout mechanism. Our PoolableLdapContext implements DirContext and
        delegates requests to a real InitialDirContext. It identifies
        timeout-related failures of LDAP operations and retries an appropriate
        number of times. When idle it "pings" the directory by searching for a
        nonexistent entry to stop the idle socket from timing out. It recovers
        from failures by making a new InitialDirContext (because we found that
        after the socket times out it remains unusable).
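
The retry-and-recreate behaviour described above can be sketched generically. This is not the PoolableLdapContext itself: the class and interface names are hypothetical, and the sketch assumes that a socket timeout surfaces as a javax.naming.CommunicationException. The idle "ping" is omitted.

```java
import javax.naming.CommunicationException;
import javax.naming.NamingException;

// Hypothetical retry wrapper; C stands for the context type (e.g. DirContext).
final class RetryingExecutor<C> {
    interface ContextFactory<T> { T create() throws NamingException; }
    interface Operation<T, R> { R apply(T context) throws NamingException; }

    private final ContextFactory<C> factory;
    private final int maxAttempts;
    private C current; // lazily created, dropped after a failure

    RetryingExecutor(ContextFactory<C> factory, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.factory = factory;
        this.maxAttempts = maxAttempts;
    }

    synchronized <R> R execute(Operation<C, R> operation) throws NamingException {
        CommunicationException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (current == null) {
                    current = factory.create(); // fresh context after a failure
                }
                return operation.apply(current);
            } catch (CommunicationException e) {
                // Assumption: timeouts arrive as CommunicationException. The
                // context is unusable afterwards, so discard it; a real
                // implementation would also close it.
                last = e;
                current = null;
            }
        }
        throw last;
    }
}
```

With DirContext as the type argument, the factory would build a new InitialDirContext and each operation would be a search or lookup on it.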

        JR's Wishlist
        =============

        (1) Allow com.sun.jndi.ldap.connect.pool.* parameters passed in to the
        InitialDirContext environment to override the System Properties, so that
        each directory server can have its own pooling characteristics.

        (2) Make it easier to specify socket parameters. One way would be to make
        the InitialDirContext's environment available as a parameter to the
        createSocket calls. An alternative would be to pass an instance of the
        socket factory in the environment rather than just its name.

        (3) Don't disable connection pooling when a custom socket factory is used.

        (4) Provide a direct mechanism for specifying the timeout (without requiring
        a socket factory).

        (5) Is it necessary that unused sockets time out? We are not explicitly
        using UnsolicitedNotifications, so why is there a read() on the socket in
        the first place?

        (6) Load balancing/round robin. Where a pooled context is created with
        multiple LDAP URLs in PROVIDER_URL, allow the choice of using them
            (a) as a sequence of fallback directories;
            (b) as a set of peer directory agents to be cycled through (round
                robin); or
            (c) as a set of peer directory agents to be load balanced (i.e. pass
                requests to the one with the fewest outstanding requests).
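
To make options (b) and (c) in item (6) concrete, here is a minimal sketch of the two selection policies. All names are hypothetical and this is not an existing provider API; callers are assumed to increment and decrement each endpoint's outstanding counter around every request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical selector over a fixed set of peer directory agents.
final class EndpointSelector {
    static final class Endpoint {
        final String url;
        final AtomicInteger outstanding = new AtomicInteger(); // requests in flight

        Endpoint(String url) {
            this.url = url;
        }
    }

    private final List<Endpoint> endpoints;
    private final AtomicInteger cursor = new AtomicInteger();

    EndpointSelector(List<Endpoint> endpoints) {
        if (endpoints.isEmpty()) {
            throw new IllegalArgumentException("at least one endpoint is required");
        }
        this.endpoints = new ArrayList<>(endpoints);
    }

    // (b) Round robin: cycle through the endpoints in order.
    Endpoint roundRobin() {
        int index = Math.floorMod(cursor.getAndIncrement(), endpoints.size());
        return endpoints.get(index);
    }

    // (c) Load balanced: pick the endpoint with the fewest outstanding
    // requests; ties go to the earliest endpoint in the list.
    Endpoint leastOutstanding() {
        Endpoint best = endpoints.get(0);
        for (Endpoint candidate : endpoints) {
            if (candidate.outstanding.get() < best.outstanding.get()) {
                best = candidate;
            }
        }
        return best;
    }
}
```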
        ###@###.### 10/11/04 07:18 GMT

              jhangalsunw Jayalaxmi Hangal (Inactive)
              rabarker Rich Barker (Inactive)