-
Bug
-
Resolution: Cannot Reproduce
-
P2
-
None
-
1.3.0
-
sparc
-
solaris_8
Overview
We have put in place an SNMP data collector, or "polling system", using Sun's JDK and the JDMK development framework.
We currently have the following problem:
One of the multithreaded servers (the Socket_Corba server) seems unable to "reach" another server (SnmpCorba). The two servers communicate through CORBA, using the ORB provided with the J2SE (formerly the JDK). However, the SnmpCorba server remains accessible from other Java applications running outside the JVM of the affected Socket_Corba server.
General Architecture
One server (SnmpCorba) does all the SNMP data collection. It receives a request "x" and returns to the client a unique identifier, which the client later uses to extract the resulting data. The results are stored in a hash table, with a given structure, keyed by that unique identifier;
Several clients can query the SnmpCorba server;
Another server (Socket_Corba) receives all requests via a socket and queries the SnmpCorba server, which does the SNMP data collection. This server is multithreaded (approximately 40 threads in steady state). Each thread works as follows:
1. The thread checks that the SnmpCorba server is alive by invoking its isAlive() method;
2. It reads and validates the data on the socket;
3. It sends a request to the SnmpCorba server;
4. It switches to data extraction mode;
5. It returns the result on the socket;
6. It closes the socket and resets the working variables to null;
Note: the traces indicate that the threads "jam", or block, somewhere in the steps described above.
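To narrow down which step a jammed thread is sitting in, a stack dump of every live thread can help. The sketch below is only an illustration: the class name StackDumper is hypothetical, and Thread.getAllStackTraces() requires Java 5 or later, so it is not available on the 1.3.0_02 runtime described in this report. On that runtime, sending SIGQUIT to the JVM (kill -QUIT <pid> on Solaris) should print a comparable thread dump to the process's standard output.

```java
import java.util.Map;

// Hypothetical diagnostic helper; not part of the reported application.
final class StackDumper {

    /** Returns a text dump of the name, state and stack of every live thread. */
    static String dumpAll() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=")
              .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAll());
    }
}
```

Threads that stay in the same frame across several successive dumps are the ones worth examining first.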
All clients of the SnmpCorba server use the same class to query the server and retrieve the collected data. The methods that access the server are "protected" so as to ensure the correct execution order and to hide the internal structure.
Problem description
The problem with the Socket_Corba server is as follows:
After a number of requests, say "x", have been processed without problems, the Socket_Corba server continues to accept requests on the socket. However, one or more threads remain blocked in the isAlive() method of the SnmpCorba server.
The problem appears at random, but we have observed that it happens when the requests sent to the SnmpCorba server are small, so that the result is obtained quickly. Meanwhile, the SnmpCorba server remains reachable from the other clients (external to the JVM that has the problem), and their response times are excellent.
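As a stopgap while the cause is unknown, the liveness check could be bounded by a timeout so that a worker thread is not held indefinitely by isAlive(). This is a minimal sketch under stated assumptions, not a fix for the underlying hang: BoundedCall and succeedsWithin are hypothetical names, and java.util.concurrent and lambdas require Java 5 and Java 8 respectively, so the code does not compile on 1.3.0_02.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; not part of the reported application.
final class BoundedCall {

    // Daemon threads, so a call that never returns cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-call");
        t.setDaemon(true);
        return t;
    });

    /** Returns true only if the call returns TRUE within timeoutMillis. */
    static boolean succeedsWithin(Callable<Boolean> call, long timeoutMillis) {
        Future<Boolean> future = POOL.submit(call);
        try {
            return Boolean.TRUE.equals(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Requests an interrupt; a thread blocked in a socket read may ignore it.
            future.cancel(true);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            return false; // the call itself threw an exception
        }
    }
}
```

A worker could then call BoundedCall.succeedsWithin(() -> snmpServer.isAlive(), 2000), where the 2000 msec limit is an arbitrary illustrative value. The stuck call still occupies a pool thread, so this only keeps the request thread responsive; it does not release whatever the ORB is waiting on.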
The general algorithm applied to the data extraction is as follows:
While (client does not have all data)
    The client asks the server whether it has data for the UID "z";
    If data is available then
        Get the data
    else
        Sleep 300 msec
Endwhile
Notes:
1. We think that the above data extraction algorithm is the source of the problem, but we don't know where or how.
2. UID = unique identifier referenced above
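The extraction loop above can be sketched in Java with an overall deadline added, so that a request whose result never arrives does not poll forever. This is an illustrative sketch only: ResultPoller, poll and the timeout parameter are hypothetical, the sketch waits for a single result rather than several pieces of data, and java.util.function requires Java 8, which is later than the runtime in this report.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical helper; not part of the reported application.
final class ResultPoller {

    /**
     * Polls isAvailable every intervalMillis until it returns true, then
     * returns fetch.get(). Returns null if timeoutMillis elapses first or
     * if the polling thread is interrupted.
     */
    static <T> T poll(BooleanSupplier isAvailable, Supplier<T> fetch,
                      long intervalMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (isAvailable.getAsBoolean()) {
                return fetch.get();
            }
            if (System.nanoTime() - deadline >= 0) {
                return null; // gave up: the result did not arrive in time
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
    }
}
```

With the IDL shown in this report, a call might look like ResultPoller.poll(() -> snmpServer.resultIsAvailable(id), () -> snmpServer.getResult(id), 300, 30000), where the 30 second limit is an assumed value.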
Questions
Could the fact that we query the SnmpCorba server from so many threads (say "x"), each at an interval of 300 msec, to obtain the data cause the problem? If so, how?
There is currently no callback mechanism to return the data to the client, although we plan to implement one. Before we do, however, the load on the SnmpCorba server is likely to increase. We would therefore like to know whether the JDK's ORB can sustain additional load and, if so, to what extent. In other words, we would like to understand the limitations of this component and any specifics we should be aware of when using it.
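For reference, the planned callback mechanism could take roughly the following shape on the client side: each request identifier maps to a pending result that the waiting thread blocks on, and the callback completes it when the data arrives, so no 300 msec polling is needed. This is a plain-Java sketch of the pattern only. ResultRegistry and its methods are hypothetical, the CORBA callback interface that would invoke deliver() is not shown, and CompletableFuture requires Java 8.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical client-side registry; not part of the reported application.
final class ResultRegistry<T> {

    private final ConcurrentMap<String, CompletableFuture<T>> pending =
            new ConcurrentHashMap<>();

    /** Waiting side: returns the future associated with a request id. */
    CompletableFuture<T> waitFor(String id) {
        return pending.computeIfAbsent(id, k -> new CompletableFuture<>());
    }

    /** Callback side: stores the result and wakes up any waiter. */
    void deliver(String id, T result) {
        // computeIfAbsent also covers a result that arrives before anyone waits.
        pending.computeIfAbsent(id, k -> new CompletableFuture<>()).complete(result);
    }

    /** Drops the entry once the result has been consumed. */
    void forget(String id) {
        pending.remove(id);
    }
}
```

A worker thread would call waitFor(id).get(timeout, unit) instead of looping, then forget(id), which plays the same role as deleteId in the existing interface.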
Configuration (hardware):
Sun server E220R (2 CPUs at 400 MHz, 1 Gbyte RAM and 1 Gbyte swap);
Configuration (software):
J2SE 1.3.0_02 installed;
VL-MSS-SR003-FAE1 (nms): java -version
Java version "1.3.0_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0_02)
Java HotSpot(TM) Client VM (build 1.3.0_02, mixed mode)
Test case follows:
Software implementation example
----------------------------------------------------------------------------------------------------------
-- Obtaining a reference to the SnmpCorba server
......
// create and initialize the ORB
ORB orb = ORB.init(args, null);
// Get the stringified object reference and destringify it.
String filename = nmsDir + "/ior/SnmpServer.ior";
BufferedReader br = new BufferedReader(new FileReader(filename));
String ior = br.readLine();
org.omg.CORBA.Object obj = orb.string_to_object(ior);
snmpServer = SnmpServerCorbaHelper.narrow(obj);
snmpServer.isAlive();
.......
----------------------------------------------------------------------------------------------------------
-- IDL of the SnmpCorba server
module nms
{
module polling
{
module snmp{
module corba{
interface SnmpServerCorba
{
string WorkOnN_IpOid(in StructN_IpOid structN_IpOid);
string addWorkN_IpOid(in StructN_IpOid structN_IpOid,in string id);
string WorkOnNIp_Oid(in StructNIp_Oid structNIp_Oid);
string addWorkNIp_Oid(in StructNIp_Oid structNIp_Oid,in string id);
boolean resultIsAvailable(in string id);
NetworkObjectSeq getResult(in string id);
string getTable(in StructGetTableData structGetTableData);
boolean getTableResulIsAvailable(in string id);
GetTableNetworkObjectSeq getTableResult(in string id);
void deleteId(in string id);
boolean isAlive();
string setDebugLevel(in long level);
nms::util::stringSeq listId();
};
};
};
};
};
struct StructNIp_Oid
{
string action;
long level;
nms::util::stringSeq tabKey;
nms::util::stringSeq tabIp;
nms::util::stringSeq tabOid;
};
struct StructSnmpData
{
string ip ;
nms::util::stringSeq tabKey;
nms::util::stringSeq tabOid;
};
typedef sequence<StructSnmpData> StructSnmpDataSeq;
};
struct StructN_IpOid
{
string action;
long level;
StructSnmpDataSeq structSnmpData;
};
----------------------------------------------------------------------------------------------------------
-- Implementation of the isAlive method
public boolean isAlive()
{
return true;
}
Please see the comments section for a full synopsis of the test environment platform...
Please see the comments section for a description of where to find the test case...