-
Bug
-
Resolution: Not an Issue
-
P2
-
6u5
-
generic
-
generic
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-2163885 | 5.0-pool | Abhijit Saha | P2 | Closed | Not an Issue |
Description
Application server: SJSAS 9.1u1
Java version: 1.6.0_05
OS: All
Problem:
Customer is running a web service and they noticed that under load about 1 in 1000 requests fail with various errors that would indicate that some portions of the XML data were missing. They ran network traces as the first step in debugging the problem, but those traces showed that the complete XML document was received by the application server.
The customer noticed NullPointerExceptions when trying to reference objects that should have been generated by the webservice mechanism based on the submitted SOAP requests. They also noticed the following stack trace coming from their SOAPHandler.
org.w3c.dom.DOMException: NOT_FOUND_ERR: An attempt is made to reference a node in a context where it does not exist.
com.sun.org.apache.xerces.internal.dom.ParentNode.internalRemoveChild(ParentNode.java:497)
com.sun.org.apache.xerces.internal.dom.ParentNode.removeChild(ParentNode.java:478)
com.sun.org.apache.xerces.internal.dom.ParentNode.internalInsertBefore(ParentNode.java:387)
com.sun.org.apache.xerces.internal.dom.ParentNode.insertBefore(ParentNode.java:284)
com.sun.xml.messaging.saaj.soap.impl.ElementImpl.replaceElementWithSOAPElement(ElementImpl.java:715)
com.sun.xml.messaging.saaj.soap.impl.BodyImpl.convertToSoapElement(BodyImpl.java:287)
com.sun.xml.messaging.saaj.soap.impl.ElementImpl$5.next(ElementImpl.java:1169)
com.appriss.methcheck.ws.MessageFaultHandler.handleMessage(MessageFaultHandler.java:170)
com.appriss.methcheck.ws.MessageFaultHandler.handleMessage(MessageFaultHandler.java:27)
com.sun.xml.ws.handler.HandlerProcessor.callHandleMessage(HandlerProcessor.java:284)
com.sun.xml.ws.handler.HandlerProcessor.callHandlersRequest(HandlerProcessor.java:135)
com.sun.xml.ws.handler.ServerSOAPHandlerTube.callHandlersOnRequest(ServerSOAPHandlerTube.java:181)
com.sun.xml.ws.handler.HandlerTube.processRequest(HandlerTube.java:114)
com.sun.xml.ws.api.pipe.Fiber.__doRun(Fiber.java:595)
com.sun.xml.ws.api.pipe.Fiber._doRun(Fiber.java:554)
com.sun.xml.ws.api.pipe.Fiber.doRun(Fiber.java:539)
com.sun.xml.ws.api.pipe.Fiber.runSync(Fiber.java:436)
com.sun.xml.ws.api.pipe.helper.AbstractTubeImpl.process(AbstractTubeImpl.java:106)
com.sun.enterprise.webservice.MonitoringPipe.process(MonitoringPipe.java:147)
com.sun.xml.ws.api.pipe.helper.PipeAdapter.processRequest(PipeAdapter.java:115)
com.sun.xml.ws.api.pipe.Fiber.__doRun(Fiber.java:595)
com.sun.xml.ws.api.pipe.Fiber._doRun(Fiber.java:554)
com.sun.xml.ws.api.pipe.Fiber.doRun(Fiber.java:539)
com.sun.xml.ws.api.pipe.Fiber.runSync(Fiber.java:436)
com.sun.xml.ws.api.pipe.helper.AbstractTubeImpl.process(AbstractTubeImpl.java:106)
com.sun.enterprise.webservice.CommonServerSecurityPipe$1.run(CommonServerSecurityPipe.java:232)
java.security.AccessController.doPrivileged(Native Method)
javax.security.auth.Subject.doAsPrivileged(Subject.java:517)
com.sun.enterprise.webservice.CommonServerSecurityPipe.processRequest(CommonServerSecurityPipe.java:228)
com.sun.enterprise.webservice.CommonServerSecurityPipe.process(CommonServerSecurityPipe.java:129)
com.sun.xml.ws.api.pipe.helper.PipeAdapter.processRequest(PipeAdapter.java:115)
com.sun.xml.ws.api.pipe.Fiber.__doRun(Fiber.java:595)
com.sun.xml.ws.api.pipe.Fiber._doRun(Fiber.java:554)
com.sun.xml.ws.api.pipe.Fiber.doRun(Fiber.java:539)
com.sun.xml.ws.api.pipe.Fiber.runSync(Fiber.java:436)
com.sun.xml.ws.server.WSEndpointImpl$2.process(WSEndpointImpl.java:243)
com.sun.xml.ws.transport.http.HttpAdapter$HttpToolkit.handle(HttpAdapter.java:444)
com.sun.xml.ws.transport.http.HttpAdapter.handle(HttpAdapter.java:244)
com.sun.xml.ws.transport.http.servlet.ServletAdapter.handle(ServletAdapter.java:135)
com.sun.enterprise.webservice.Ejb3MessageDispatcher.handlePost(Ejb3MessageDispatcher.java:113)
com.sun.enterprise.webservice.Ejb3MessageDispatcher.invoke(Ejb3MessageDispatcher.java:87)
com.sun.enterprise.webservice.EjbWebServiceServlet.dispatchToEjbEndpoint(EjbWebServiceServlet.java:200)
com.sun.enterprise.webservice.EjbWebServiceServlet.service(EjbWebServiceServlet.java:129)
javax.servlet.http.HttpServlet.service(HttpServlet.java:831)
com.sun.enterprise.web.AdHocContextValve.invoke(AdHocContextValve.java:114)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:632)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:577)
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:571)
com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:87)
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:206)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:632)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:577)
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:571)
org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1080)
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:150)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:632)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:577)
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:571)
org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1080)
org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:272)
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.invokeAdapter(DefaultProcessorTask.java:637)
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.doProcess(DefaultProcessorTask.java:568)
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.process(DefaultProcessorTask.java:813)
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.executeProcessorTask(DefaultReadTask.java:341)
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:263)
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:214)
com.sun.enterprise.web.portunif.PortUnificationPipeline$PUTask.doTask(PortUnificationPipeline.java:380)
com.sun.enterprise.web.connector.grizzly.TaskBase.run(TaskBase.java:265)
com.sun.enterprise.web.connector.grizzly.ssl.SSLWorkerThread.run(SSLWorkerThread.java:106)
The source code that results in the problem is attached (MessageFaultHandler.java-with_problem).
Customer also provided a modified source code that decreases the frequency of the problem (MessageFaultHandler.java-with_problem_less_frequent). The modified code where they walk the DOM tree in the
SOAPHandler doesn't eliminate the problem, it only makes it less likely to happen. It changed the problem from happening about 1 in 1000 transactions to 1 in 10000 transactions.
It seems that under high load there is an intermittent bug somewhere in the DOM->SOAP processing which happens less frequently if the DOM tree is walked by the SOAPHandler prior to SOAP related access.
In the attached java files note that the significant change that seems to allows them to reduce the frequency of the bug is the change from using SOAP tree access to DOM tree access, specifically request.getSOAPBody().getChildElements() changed to request.getSOAPBody().getChildNodes().
You can deploy the attached ear file(MethCheck.ear) into a SJSAS install to run the web service without any dependencies. Also attached is a command line client (WSTest.tgz) that you can use to load the web service and generate the error.
For this test the web service simply writes what it found to System.out. You can grep for the string "element is null" to find instances where the problem occurred:
users-jsterr-mac:/Applications/NetBeans/glassfish-v2/domains/domain1/logs jsterr$ grep "element is null" server*
server.log_2008-05-05T17-37-01:person-info element is null|#]
server.log_2008-05-05T17-37-04:person-info element is null|#]
To run the client you have to specify 3 parameters, the number of threads, the number of iterations for each thread,
and the WSDL URL. Here is an example of running it on the same server as SJSAS:
java -jar WSTest.jar 8 2000 http://localhost:8080/methCheckService/MethCheckWS?wsdl
The errors:
org.w3c.dom.DOMException: NOT_FOUND_ERR: An attempt is made to reference a node in a context where it does not exist.
AND
users-jsterr-mac:/Applications/NetBeans/glassfish-v2/domains/domain1/logs jsterr$ grep "element is null" server*
server.log_2008-05-05T17-37-01:person-info element is null|#]
server.log_2008-05-05T17-37-04:person-info element is null|#]
are independent. Customer believes one error or the other error happens depending on which way the tree gets messed up.
The problem is best reproducible by running the client and server on different machines, and both machines have multiple CPU cores so there was concurrency in both the client and server.
Customer also mentioned that based on further testing while trying to put together this example ear file it seems that the problem does not occur if there is no SOAPHandler. Customer has tested the full application with the SOAPHandlers removed. There were no errors after 62,000 iterations.
The SOAPHandlers can be removed with just an XML config change. Inside of MethCheck.ear is MethCheckEJB.jar. Inside the jar are 2 copies of the file MethCheckWS_handler.xml. If you replace the original files with the attached file (MethCheckWS_handler.xml) it will remove the handlers from the chain.
MethCheckWS_handler.xml with SOAPHandler
========================================
<?xml version="1.0" encoding="UTF-8"?>
<handler-chains xmlns="http://java.sun.com/xml/ns/javaee">
<handler-chain>
<handler>
<handler-name>com.appriss.methcheck.ws.WhitespaceMessageHandler</handler-name>
<handler-class>com.appriss.methcheck.ws.WhitespaceMessageHandler</handler-class>
</handler>
<handler>
<handler-name>com.appriss.methcheck.ws.MessageFaultHandler</handler-name>
<handler-class>com.appriss.methcheck.ws.MessageFaultHandler</handler-class>
</handler>
</handler-chain>
</handler-chains>
MethCheckWS_handler.xml without SOAPHandler
========================================
<?xml version="1.0" encoding="UTF-8"?>
<handler-chains xmlns="http://java.sun.com/xml/ns/javaee">
</handler-chains>
The customer believe that this an Appserver/Java bug since the handler is pretty simple, and when it gets an exception it is in code that is just trying to walk the tree. They can run without the handler as an interim solution but they will lose some logging by doing so. They would like to add it back in when a fix is available.
Please refer to CR 6700018 for the attachments.
Application server: SJSAS 9.1u1
Java version: 1.6.0_05
OS: All
Problem:
Customer is running a web service and they noticed that under load about 1 in 1000 requests fail with various errors that would indicate that some portions of the XML data were missing. They ran network traces as the first step in debugging the problem, but those traces showed that the complete XML document was received by the application server.
The customer noticed NullPointerExceptions when trying to reference objects that should have been generated by the webservice mechanism based on the submitted SOAP requests. They also noticed the following stack trace coming from their SOAPHandler.
org.w3c.dom.DOMException: NOT_FOUND_ERR: An attempt is made to reference a node in a context where it does not exist.
com.sun.org.apache.xerces.internal.dom.ParentNode.internalRemoveChild(ParentNode.java:497)
com.sun.org.apache.xerces.internal.dom.ParentNode.removeChild(ParentNode.java:478)
com.sun.org.apache.xerces.internal.dom.ParentNode.internalInsertBefore(ParentNode.java:387)
com.sun.org.apache.xerces.internal.dom.ParentNode.insertBefore(ParentNode.java:284)
com.sun.xml.messaging.saaj.soap.impl.ElementImpl.replaceElementWithSOAPElement(ElementImpl.java:715)
com.sun.xml.messaging.saaj.soap.impl.BodyImpl.convertToSoapElement(BodyImpl.java:287)
com.sun.xml.messaging.saaj.soap.impl.ElementImpl$5.next(ElementImpl.java:1169)
com.appriss.methcheck.ws.MessageFaultHandler.handleMessage(MessageFaultHandler.java:170)
com.appriss.methcheck.ws.MessageFaultHandler.handleMessage(MessageFaultHandler.java:27)
com.sun.xml.ws.handler.HandlerProcessor.callHandleMessage(HandlerProcessor.java:284)
com.sun.xml.ws.handler.HandlerProcessor.callHandlersRequest(HandlerProcessor.java:135)
com.sun.xml.ws.handler.ServerSOAPHandlerTube.callHandlersOnRequest(ServerSOAPHandlerTube.java:181)
com.sun.xml.ws.handler.HandlerTube.processRequest(HandlerTube.java:114)
com.sun.xml.ws.api.pipe.Fiber.__doRun(Fiber.java:595)
com.sun.xml.ws.api.pipe.Fiber._doRun(Fiber.java:554)
com.sun.xml.ws.api.pipe.Fiber.doRun(Fiber.java:539)
com.sun.xml.ws.api.pipe.Fiber.runSync(Fiber.java:436)
com.sun.xml.ws.api.pipe.helper.AbstractTubeImpl.process(AbstractTubeImpl.java:106)
com.sun.enterprise.webservice.MonitoringPipe.process(MonitoringPipe.java:147)
com.sun.xml.ws.api.pipe.helper.PipeAdapter.processRequest(PipeAdapter.java:115)
com.sun.xml.ws.api.pipe.Fiber.__doRun(Fiber.java:595)
com.sun.xml.ws.api.pipe.Fiber._doRun(Fiber.java:554)
com.sun.xml.ws.api.pipe.Fiber.doRun(Fiber.java:539)
com.sun.xml.ws.api.pipe.Fiber.runSync(Fiber.java:436)
com.sun.xml.ws.api.pipe.helper.AbstractTubeImpl.process(AbstractTubeImpl.java:106)
com.sun.enterprise.webservice.CommonServerSecurityPipe$1.run(CommonServerSecurityPipe.java:232)
java.security.AccessController.doPrivileged(Native Method)
javax.security.auth.Subject.doAsPrivileged(Subject.java:517)
com.sun.enterprise.webservice.CommonServerSecurityPipe.processRequest(CommonServerSecurityPipe.java:228)
com.sun.enterprise.webservice.CommonServerSecurityPipe.process(CommonServerSecurityPipe.java:129)
com.sun.xml.ws.api.pipe.helper.PipeAdapter.processRequest(PipeAdapter.java:115)
com.sun.xml.ws.api.pipe.Fiber.__doRun(Fiber.java:595)
com.sun.xml.ws.api.pipe.Fiber._doRun(Fiber.java:554)
com.sun.xml.ws.api.pipe.Fiber.doRun(Fiber.java:539)
com.sun.xml.ws.api.pipe.Fiber.runSync(Fiber.java:436)
com.sun.xml.ws.server.WSEndpointImpl$2.process(WSEndpointImpl.java:243)
com.sun.xml.ws.transport.http.HttpAdapter$HttpToolkit.handle(HttpAdapter.java:444)
com.sun.xml.ws.transport.http.HttpAdapter.handle(HttpAdapter.java:244)
com.sun.xml.ws.transport.http.servlet.ServletAdapter.handle(ServletAdapter.java:135)
com.sun.enterprise.webservice.Ejb3MessageDispatcher.handlePost(Ejb3MessageDispatcher.java:113)
com.sun.enterprise.webservice.Ejb3MessageDispatcher.invoke(Ejb3MessageDispatcher.java:87)
com.sun.enterprise.webservice.EjbWebServiceServlet.dispatchToEjbEndpoint(EjbWebServiceServlet.java:200)
com.sun.enterprise.webservice.EjbWebServiceServlet.service(EjbWebServiceServlet.java:129)
javax.servlet.http.HttpServlet.service(HttpServlet.java:831)
com.sun.enterprise.web.AdHocContextValve.invoke(AdHocContextValve.java:114)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:632)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:577)
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:571)
com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:87)
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:206)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:632)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:577)
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:571)
org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1080)
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:150)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:632)
org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:577)
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:571)
org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1080)
org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:272)
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.invokeAdapter(DefaultProcessorTask.java:637)
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.doProcess(DefaultProcessorTask.java:568)
com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.process(DefaultProcessorTask.java:813)
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.executeProcessorTask(DefaultReadTask.java:341)
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:263)
com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:214)
com.sun.enterprise.web.portunif.PortUnificationPipeline$PUTask.doTask(PortUnificationPipeline.java:380)
com.sun.enterprise.web.connector.grizzly.TaskBase.run(TaskBase.java:265)
com.sun.enterprise.web.connector.grizzly.ssl.SSLWorkerThread.run(SSLWorkerThread.java:106)
The source code that results in the problem is attached (MessageFaultHandler.java-with_problem).
Customer also provided a modified source code that decreases the frequency of the problem (MessageFaultHandler.java-with_problem_less_frequent). The modified code where they walk the DOM tree in the
SOAPHandler doesn't eliminate the problem, it only makes it less likely to happen. It changed the problem from happening about 1 in 1000 transactions to 1 in 10000 transactions.
It seems that under high load there is an intermittent bug somewhere in the DOM->SOAP processing which happens less frequently if the DOM tree is walked by the SOAPHandler prior to SOAP related access.
In the attached java files note that the significant change that seems to allows them to reduce the frequency of the bug is the change from using SOAP tree access to DOM tree access, specifically request.getSOAPBody().getChildElements() changed to request.getSOAPBody().getChildNodes().
You can deploy the attached ear file(MethCheck.ear) into a SJSAS install to run the web service without any dependencies. Also attached is a command line client (WSTest.tgz) that you can use to load the web service and generate the error.
For this test the web service simply writes what it found to System.out. You can grep for the string "element is null" to find instances where the problem occurred:
users-jsterr-mac:/Applications/NetBeans/glassfish-v2/domains/domain1/logs jsterr$ grep "element is null" server*
server.log_2008-05-05T17-37-01:person-info element is null|#]
server.log_2008-05-05T17-37-04:person-info element is null|#]
To run the client you have to specify 3 parameters, the number of threads, the number of iterations for each thread,
and the WSDL URL. Here is an example of running it on the same server as SJSAS:
java -jar WSTest.jar 8 2000 http://localhost:8080/methCheckService/MethCheckWS?wsdl
The errors:
org.w3c.dom.DOMException: NOT_FOUND_ERR: An attempt is made to reference a node in a context where it does not exist.
AND
users-jsterr-mac:/Applications/NetBeans/glassfish-v2/domains/domain1/logs jsterr$ grep "element is null" server*
server.log_2008-05-05T17-37-01:person-info element is null|#]
server.log_2008-05-05T17-37-04:person-info element is null|#]
are independent. Customer believes one error or the other error happens depending on which way the tree gets messed up.
The problem is best reproducible by running the client and server on different machines, and both machines have multiple CPU cores so there was concurrency in both the client and server.
Customer also mentioned that based on further testing while trying to put together this example ear file it seems that the problem does not occur if there is no SOAPHandler. Customer has tested the full application with the SOAPHandlers removed. There were no errors after 62,000 iterations.
The SOAPHandlers can be removed with just an XML config change. Inside of MethCheck.ear is MethCheckEJB.jar. Inside the jar are 2 copies of the file MethCheckWS_handler.xml. If you replace the original files with the attached file (MethCheckWS_handler.xml) it will remove the handlers from the chain.
MethCheckWS_handler.xml with SOAPHandler
========================================
<?xml version="1.0" encoding="UTF-8"?>
<handler-chains xmlns="http://java.sun.com/xml/ns/javaee">
<handler-chain>
<handler>
<handler-name>com.appriss.methcheck.ws.WhitespaceMessageHandler</handler-name>
<handler-class>com.appriss.methcheck.ws.WhitespaceMessageHandler</handler-class>
</handler>
<handler>
<handler-name>com.appriss.methcheck.ws.MessageFaultHandler</handler-name>
<handler-class>com.appriss.methcheck.ws.MessageFaultHandler</handler-class>
</handler>
</handler-chain>
</handler-chains>
MethCheckWS_handler.xml without SOAPHandler
========================================
<?xml version="1.0" encoding="UTF-8"?>
<handler-chains xmlns="http://java.sun.com/xml/ns/javaee">
</handler-chains>
The customer believe that this an Appserver/Java bug since the handler is pretty simple, and when it gets an exception it is in code that is just trying to walk the tree. They can run without the handler as an interim solution but they will lose some logging by doing so. They would like to add it back in when a fix is available.
Please refer to CR 6700018 for the attachments.
- backported by
-
JDK-2163885 JDK 5 & 6 / Web service requests lose data under load
-
- Closed
-