Type: Enhancement
Resolution: Fixed
Priority: P4
Fix Version: 6
Labels: None
Resolved In Build: b40
CPU: sparc
OS: solaris_9
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2127952 | 5.0u5 | Jayalaxmi Hangal | P4 | Resolved | Fixed | b04 |
Jacobs Rimell is providing a solution for Comcast.
They've run into some issues and want us to be aware of them, and they have
suggested some changes that they feel would enhance the interface.
Background Info
================
Jacobs Rimell (JR) have a product called APS. The current version (APS4) is
a J2EE application that requires LDAP access to multiple LDAP directories
from within an app-server.
At Comcast there is only one Directory (the GDS); however, some of this email
reflects APS requirements as well as Comcast's requirements on APS. Given
that Comcast contacted you originally, I have deliberately indicated where
the issue / current implementation has a direct impact on the customer.
JR Issues with LDAP connection pooling
======================================
Originally we tried to use the inbuilt connection pooling in recent releases
of Sun's LDAP provider. This had the advantage of near-transparency.
First problem (JR only): configuration of pool sizes etc. is on a per-VM
basis. The only choice available when making a connection to a specific
directory is whether or not to pool connections. Since our product may need
to work with several directories with radically different characteristics we
decided to progress with this until we had a direct (new customer)
requirement in order to expedite our product roadmap.
Second problem (JR / Comcast): We implemented a simple round-robin
load-balancing algorithm against multiple eTrust DSAs by leveraging the
capability of Sun's LDAP SP to accept a list of URLs in its
Context.PROVIDER_URL parameter. For each new InitialDirContext request we
rotate the URLs so that each request goes to a different DSA. We believe
that the connection pooling creates a pool based on the host/port of the
URL, which means that with, for example, three URLs (e.g. "ldap://a:389
ldap://a:10389 ldap://a:20389") we would end up with three connection pools.
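The rotation described above can be sketched roughly as follows. This is our illustration, not JR's actual code: the class `UrlRotator` and its method names are assumptions, but `Context.PROVIDER_URL` accepting a space-separated URL list is the real JNDI LDAP behaviour being exploited.

```java
import java.util.Hashtable;
import java.util.concurrent.atomic.AtomicInteger;
import javax.naming.Context;

// Round-robin rotation of a space-separated LDAP URL list, as accepted by
// Context.PROVIDER_URL. Class and method names are illustrative only.
public class UrlRotator {
    private final String[] urls;
    private final AtomicInteger next = new AtomicInteger(0);

    public UrlRotator(String spaceSeparatedUrls) {
        this.urls = spaceSeparatedUrls.trim().split("\\s+");
    }

    // Returns the URL list rotated so that a different server is tried first
    // on each call; the remaining URLs act as fallbacks.
    public String nextProviderUrl() {
        int start = Math.floorMod(next.getAndIncrement(), urls.length);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < urls.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(urls[(start + i) % urls.length]);
        }
        return sb.toString();
    }

    // How the rotated list would be fed to a new InitialDirContext.
    public Hashtable<String, Object> env() {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, nextProviderUrl());
        return env;
    }
}
```

Since pooling keys on host/port rather than on the whole URL list, each rotated permutation still draws from the same per-host/port pools, which is the source of the surprise described above.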
Third problem (JR / Comcast): Lost TCP packets on a customer network. The
root cause was a misconfigured VPN, which prevented search results returning
to the LDAP client. The effect was that the thread executing the
LdapRequest in the application server hung waiting for a response until the
TCP socket timed out after 2 hours. This quickly led to all execute threads
being stuck. We reproduced this in our lab in London by heavily exercising
a DSA (outside of APS) until its response to APS mimicked the misconfigured
VPN. While these two scenarios are extreme, APS is required to be always
available and hence must survive operational issues like these.
Setting search timeouts using SearchControls didn't help as this mechanism
relies on the directory server aborting the search request if it takes too
long. Our problem was not the duration of the search itself but the
non-arrival of the results.
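To illustrate why this mechanism cannot help: the `SearchControls` time limit is sent to the server inside the LDAP search request, and it is the server that aborts an over-long search. The wrapper class name here is ours, purely for illustration.

```java
import javax.naming.directory.SearchControls;

public class TimeLimitDemo {
    // setTimeLimit() sets the LDAP-protocol time limit carried in the search
    // request; the *server* aborts the search when it runs too long. A client
    // whose reply packets never arrive still blocks on the socket read.
    static SearchControls limitedControls() {
        SearchControls controls = new SearchControls();
        controls.setTimeLimit(5000); // milliseconds, enforced server-side
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        return controls;
    }
}
```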
LDAP socket factories
=====================
To avoid the stuck execute threads we decided we had to implement a timeout
on LDAP sockets. The supported mechanism for this is to install a
SocketFactory for LDAP requests, and within that SocketFactory to set the
socket's SO_TIMEOUT. The name of the socket factory's class is used as a
parameter during connection to the directory.
This worked inasmuch as it caused the socket to fail when it waited too long
for a response -- which we could then detect and apply a suitable retry
strategy to -- but had other problems.
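A minimal sketch of such a factory follows, assuming the per-context `java.naming.ldap.factory.socket` environment property (which takes a class name, installed via something like `env.put("java.naming.ldap.factory.socket", TimeoutSocketFactory.class.getName())`). The class name and timeout value are illustrative; note that the timeout must be a compile-time constant precisely because only a class name can be passed, which is the limitation described below.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import javax.net.SocketFactory;

// Illustrative socket factory whose sockets carry an SO_TIMEOUT, so reads
// fail after a bounded wait instead of hanging for hours.
public class TimeoutSocketFactory extends SocketFactory {
    // Per-directory values would require a distinct class per directory.
    private static final int SO_TIMEOUT_MS = 30_000;

    // The JNDI LDAP provider obtains the factory via a static getDefault().
    public static SocketFactory getDefault() {
        return new TimeoutSocketFactory();
    }

    private Socket withTimeout(Socket s) throws IOException {
        s.setSoTimeout(SO_TIMEOUT_MS); // read() now times out instead of blocking
        return s;
    }

    @Override public Socket createSocket() throws IOException {
        return withTimeout(new Socket());
    }
    @Override public Socket createSocket(String host, int port) throws IOException {
        return withTimeout(new Socket(host, port));
    }
    @Override public Socket createSocket(String host, int port,
            InetAddress localAddr, int localPort) throws IOException {
        return withTimeout(new Socket(host, port, localAddr, localPort));
    }
    @Override public Socket createSocket(InetAddress host, int port) throws IOException {
        return withTimeout(new Socket(host, port));
    }
    @Override public Socket createSocket(InetAddress host, int port,
            InetAddress localAddr, int localPort) throws IOException {
        return withTimeout(new Socket(host, port, localAddr, localPort));
    }
}
```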
Firstly (JR only): there is no easy way to configure different timeouts for
different directories. Because the only parameter available is the name of
the socket factory, it is necessary to implement a distinct physical class
for each directory. Within the factory it seems that only the
createSocket() method is used -- so we cannot even use the destination
host/port to decide on the timeout to use.
Secondly (JR / Comcast): the use of a custom LDAP socket factory explicitly
disables Sun's connection pooling!
Thirdly (JR / Comcast): the timeout appears to cause the socket to fail when
idle. We assume that the service provider is listening for unsolicited
notifications and timing out. Once a socket fails the connection is no
longer usable.
Our Current solution
====================
We have stopped using the connection pooling in Sun's LDAP service provider
and implemented our own context pool instead. We are still using the socket
timeout mechanism. Our PoolableLdapContext implements DirContext and
delegates requests to a real InitialDirContext. It identifies
timeout-related failures of LDAP operations and retries an appropriate
number of times. When idle, it "pings" the directory by searching for a
nonexistent entry to stop the idle socket from timing out. It recovers from
failures by making a new InitialDirContext (because we found that after the
socket times out it remains unusable).
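A condensed, hypothetical sketch of this retry-and-rebuild pattern; the class name, retry count, and ping DN are our own illustration, not JR's actual PoolableLdapContext.

```java
import java.util.Hashtable;
import javax.naming.CommunicationException;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

// Illustrative retry/recovery wrapper around InitialDirContext.
public class RetryingLdapSearch {
    private final Hashtable<String, Object> env;
    private DirContext delegate;

    public RetryingLdapSearch(Hashtable<String, Object> env) {
        this.env = env;
    }

    // Retry timeout-style failures a bounded number of times. A timed-out
    // socket leaves the context unusable, so recovery means rebuilding it.
    public NamingEnumeration<SearchResult> search(String base, String filter,
            SearchControls controls) throws NamingException {
        NamingException last = null;
        for (int attempt = 0; attempt < 3; attempt++) {
            try {
                if (delegate == null) {
                    delegate = new InitialDirContext(env);
                }
                return delegate.search(base, filter, controls);
            } catch (CommunicationException e) { // covers socket timeouts
                last = e;
                closeQuietly(); // discard the broken context; rebuilt next attempt
            }
        }
        throw last;
    }

    // Idle "ping": search for an entry that cannot exist, purely to keep
    // traffic flowing on the pooled socket.
    public void ping() throws NamingException {
        SearchControls sc = new SearchControls();
        sc.setSearchScope(SearchControls.OBJECT_SCOPE);
        search("cn=__aps_ping__", "(objectClass=*)", sc);
    }

    private void closeQuietly() {
        try {
            if (delegate != null) delegate.close();
        } catch (NamingException ignored) {
        } finally {
            delegate = null;
        }
    }
}
```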
JR's Wishlist
=============
(1) Allow com.sun.jndi.ldap.connect.pool.* parameters passed in to the
InitialDirContext environment to override the System Properties, so that
each directory server can have its own pooling characteristics.
(2) Make it easier to specify socket parameters. One way would be to make
the InitialDirContext's environment available as a parameter to the
createSocket calls. An alternative would be to pass an instance of the
socket factory in the environment rather than just its name.
(3) Don't disable connection pooling when a custom socket factory is used.
(4) Provide a direct mechanism for specifying the timeout (without requiring
a socket factory).
(5) Is it necessary that unused sockets time out? We are not explicitly
using UnsolicitedNotifications, so why is there a read() on the socket in
the first place?
(6) Load balancing/round robining. Where a pooled context is created with
multiple LDAP URLs in PROVIDER_URL, allow the choice of using them
(a) as a sequence of fallback directories.
(b) as a set of peer directory agents to be cycled through (round
robin)
    (c) as a set of peer directory agents to be load balanced (i.e. pass
requests to the one with the fewest outstanding requests).
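For context on items (1) and (4): `com.sun.jndi.ldap.connect.timeout` has long been a per-context environment property, and the related JDK-6176036 later added a per-context `com.sun.jndi.ldap.read.timeout`; the `com.sun.jndi.ldap.connect.pool.*` sizing knobs from item (1), by contrast, are read only from System properties. A sketch of an environment using the per-context properties (the helper class name is ours):

```java
import java.util.Hashtable;
import javax.naming.Context;

public class TimeoutEnv {
    // Builds an environment with per-context timeouts. The two timeout
    // properties are real JDK environment properties; the pool.* sizing
    // properties remain JVM-wide System properties.
    static Hashtable<String, Object> buildEnv(String providerUrl) {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, providerUrl);
        env.put("com.sun.jndi.ldap.connect.pool", "true");    // opt in to pooling
        env.put("com.sun.jndi.ldap.connect.timeout", "5000"); // ms to establish TCP
        env.put("com.sun.jndi.ldap.read.timeout", "30000");   // ms to await a reply (Java 6+)
        return env;
    }
}
```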
###@###.### 10/11/04 07:18 GMT
Backported by:
- JDK-2127952 Request for improvements to javax.naming.directory (Resolved)

Relates to:
- JDK-6288554 Context specific configuration of pooling properties. (Open)
- JDK-6176036 Require a way to specify read timeout for LDAP operations (Resolved)
- JDK-6176045 Make it easier to specify the SocketFactory to the Context (Closed)