Socket.read() thread hanging between JBoss and ActiveMQ

Given
  • My Java app is a WAR deployed to JBoss (4.0.4GA)
  • Publishes and subscribes to an ActiveMQ (5.6.0) instance
  • Java app uses Apache Camel (2.10.3) for all integration (producing & consuming) with ActiveMQ
  • JBoss and ActiveMQ each run on their own quad-core virtual server (CentOS 5.6 Final); each virtual server is on a different physical host

I have a thread-hanging issue and am seeing the following in my thread dump:

java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.read(SocketInputStream.java:129)
java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
java.io.BufferedInputStream.read(BufferedInputStream.java:317)
sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
org.springframework.remoting.httpinvoker.SimpleHttpInvokerRequestExecutor.validateResponse(SimpleHttpInvokerRequestExecutor.java:146)
org.springframework.remoting.httpinvoker.SimpleHttpInvokerRequestExecutor.doExecuteRequest(SimpleHttpInvokerRequestExecutor.java:66)
org.springframework.remoting.httpinvoker.AbstractHttpInvokerRequestExecutor.executeRequest(AbstractHttpInvokerRequestExecutor.java:136)
org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.executeRequest(HttpInvokerClientInterceptor.java:192)
org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.executeRequest(HttpInvokerClientInterceptor.java:174)
org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.invoke(HttpInvokerClientInterceptor.java:142)
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
$Proxy117.SigmaCruxer(Unknown Source)
com.tms.SigmaClient.SigmaClient.processMessage(SigmaClient.java:46)
com.tms.SigmaClient.SigmaServiceEndpoint.doSigma(SigmaServiceEndpoint.java:29)
com.tms.SigmaClient.SigmaServiceEndpoint.mark(SigmaServiceEndpoint.java:43)
sun.reflect.GeneratedMethodAccessor193.invoke(Unknown Source)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
java.lang.reflect.Method.invoke(Method.java:597)
org.apache.camel.component.bean.MethodInfo.invoke(MethodInfo.java:329)
org.apache.camel.component.bean.MethodInfo$1.proceed(MethodInfo.java:231)
org.apache.camel.component.bean.BeanProcessor.process(BeanProcessor.java:169)
org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:104)
org.apache.camel.component.bean.BeanProcessor.process(BeanProcessor.java:74)
org.apache.camel.impl.ProcessorEndpoint.onExchange(ProcessorEndpoint.java:102)
org.apache.camel.impl.ProcessorEndpoint$1.process(ProcessorEndpoint.java:72)
org.apache.camel.impl.converter.AsyncProcessorTypeConverter$ProcessorToAsyncProcessorBridge.process(AsyncProcessorTypeConverter.java:50)
org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:78)
org.apache.camel.processor.SendProcessor$2.doInAsyncProducer(SendProcessor.java:114)
org.apache.camel.impl.ProducerCache.doInAsyncProducer(ProducerCache.java:284)
org.apache.camel.processor.SendProcessor.process(SendProcessor.java:109)
org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:78)
org.apache.camel.processor.DelegateAsyncProcessor.processNext(DelegateAsyncProcessor.java:98)
org.apache.camel.processor.DelegateAsyncProcessor.process(DelegateAsyncProcessor.java:89)
org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:69)
org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:78)
org.apache.camel.processor.DelegateAsyncProcessor.processNext(DelegateAsyncProcessor.java:98)
org.apache.camel.processor.DelegateAsyncProcessor.process(DelegateAsyncProcessor.java:89)
org.apache.camel.processor.interceptor.TraceInterceptor.process(TraceInterceptor.java:99)
org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:78)
org.apache.camel.processor.RedeliveryErrorHandler.processErrorHandler(RedeliveryErrorHandler.java:318)
org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:209)
org.apache.camel.processor.DefaultChannel.process(DefaultChannel.java:305)
org.apache.camel.processor.UnitOfWorkProcessor.process(UnitOfWorkProcessor.java:102)
org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:78)
org.apache.camel.processor.DelegateAsyncProcessor.processNext(DelegateAsyncProcessor.java:98)
org.apache.camel.processor.DelegateAsyncProcessor.process(DelegateAsyncProcessor.java:89)
org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:69)
org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:104)
org.apache.camel.processor.DelegateAsyncProcessor.process(DelegateAsyncProcessor.java:85)
org.apache.camel.component.jms.EndpointMessageListener.onMessage(EndpointMessageListener.java:91)
org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:560)
org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:498)
org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:467)
org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:325)
org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:263)
org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1058)
org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1050)
org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:947)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
java.lang.Thread.run(Thread.java:662)

According to these two articles (here and here), my JBoss app has a blocking I/O operation on Socket.read() that is waiting for a completed response from a downstream service provider (in my case, ActiveMQ). Again, according to these articles, the culprit is one of the following 3 items:

  • ActiveMQ is in an unhealthy/unstable state and is responding too slowly, causing my listening/waiting/blocking threads to hang; or
  • The ActiveMQ instance itself is fine, but is processing an operation (writing to KahaDB, etc.) that is taking too long to complete, again causing my threads to hang; or
  • There are networking issues between my JBoss app (WAR) and my ActiveMQ instance.

I'm trying to figure out which of the three is the case. Is there anything in that thread dump to indicate which one it is? My understanding (after reading those articles) is that the real hang is the fact that the client-side (blocking) socket has just not received all the bytes it needs to consider the response complete; meaning it hasn't received any response from ActiveMQ, or it just hasn't received a full response.

So I ask:

  1. Is there a clear indication of which of the 3 scenarios is the case? If so, what/why? If not, what should my next step be (I am also the "admin" who set up ActiveMQ so I have full access to it as well as JBoss and the WAR deployed to it).
  2. Would upgrading to a newer version of JBoss fix this? Perhaps 4.0.4GA is using the "old" (blocking) Java I/O, whereas newer versions might use NIO? Probably a long shot, but I can't rule it out just yet.
  3. Both articles stress that proper socket-timeout configuration should be implemented which may very well mitigate all of this (although it doesn't address the underlying ActiveMQ unresponsiveness and/or networking issues):
    1. Is this a timeout I would write in my Java code? If so how and with what API? JMS? Some ActiveMQ client-side jar?
    2. Is this a timeout I implement at the OS level? If so I'm not sure how to proceed...
    3. Is this a timeout I implement on the server-side (ActiveMQ)? If so, how?

I think I'm closing in on a solution here, but I'm kind of stuck and having a tough time seeing the forest for the trees. Thanks in advance!

IAmYourFaja asked Mar 08 '13 19:03


1 Answer

I have some experience with JBoss (and Glassfish) and ActiveMQ. I've never used Camel, but I am familiar with Mule, which I've read is similar.

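Your stack trace looks like Camel linking ActiveMQ (the JMS frames at the bottom of the trace, such as DefaultMessageListenerContainer) to a web endpoint (the HTTP frames at the top, such as HttpURLConnection.getResponseCode called from Spring's SimpleHttpInvokerRequestExecutor). That suggests the thread is blocked reading the HTTP response from that endpoint rather than reading from ActiveMQ.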

I'm a bit confused as to where Camel is running (the CamelContext). You said that you have two virtual machines, one for JBoss and one for ActiveMQ. In my case, we run Mule ESB on the machine with our ActiveMQ. Where is your Camel running?

Your stack trace appears most like Problem #1 from the first post. It's as if the Camel part cannot "see" the web endpoint. Check to see that your WAR is deployed correctly, and that your web endpoint (WSDL) is visible from both virtual machines. Check your endpoints; maybe one is set to localhost or something, which would not allow it to get to another machine.
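One way to check basic reachability from each machine is a plain TCP connect with a timeout. This is only a rough sketch: the class name EndpointProbe and the method canConnect are made up for illustration, and main probes a throwaway loopback listener, so you would substitute the host and port of your real web endpoint.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper, not part of JBoss, Camel, or ActiveMQ.
public class EndpointProbe {

    /** Returns true if a TCP connection to host:port opens within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        Socket socket = new Socket();
        try {
            // connect() with a timeout fails fast instead of blocking indefinitely
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // refused, unreachable, unresolvable host, or timed out
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // nothing useful to do if close fails
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the real endpoint: a listener on a free loopback port.
        ServerSocket listener = new ServerSocket(0);
        int port = listener.getLocalPort();
        try {
            System.out.println("listening: " + canConnect("127.0.0.1", port, 1000));
        } finally {
            listener.close();
        }
        // Same port again, now that nothing is listening on it.
        System.out.println("closed:    " + canConnect("127.0.0.1", port, 1000));
    }
}
```

Running the probe from both virtual machines against the address your Camel route actually uses would show whether a localhost-style misconfiguration is in play. A successful connect only proves the port is open; it says nothing about how quickly the service answers.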

It's a bit hard to tell if it's an incomplete read or a complete inability to read. Does any data get through? It's possible that the Web Server is slowly getting overloaded and cannot keep up with requests (and starves some threads as in your error). Socket timeouts become important when you have slow responses or many requests; if you can create a test that is simple (fast and with few requests) then you can at least verify that you have basic connectivity. What data input (test) caused this error?
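On the timeout question: the top of your trace is java.net.HttpURLConnection, and that class lets the client bound both the connect and the read. Below is a minimal sketch, assuming you can get at the connection before the request is sent. The class ReadTimeoutSketch and the method statusOrTimeout are hypothetical names, and main talks to a local socket that deliberately never replies, to stand in for a slow endpoint.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

// Hypothetical illustration of client-side HTTP timeouts.
public class ReadTimeoutSketch {

    /**
     * Issues a GET and returns the HTTP status code, or -1 if no response
     * arrives within the given timeouts.
     */
    static int statusOrTimeout(URL url, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(connectTimeoutMs); // bounds the TCP connect
        conn.setReadTimeout(readTimeoutMs);       // bounds each blocking read
        try {
            // Same call that is blocked in the thread dump
            return conn.getResponseCode();
        } catch (SocketTimeoutException e) {
            return -1;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // The OS accepts the connection into the backlog, but nothing ever
        // reads the request or writes a response.
        ServerSocket silent = new ServerSocket(0);
        try {
            URL url = new URL("http://127.0.0.1:" + silent.getLocalPort() + "/");
            System.out.println(statusOrTimeout(url, 1000, 500));
        } finally {
            silent.close();
        }
    }
}
```

Without setReadTimeout the default is 0, which means wait forever, and that matches a thread parked in socketRead0. In your setup the HTTP call is made by Spring's SimpleHttpInvokerRequestExecutor, so the two setters would have to be applied to the connection it creates. I believe overriding its prepareConnection method is one place to do that, but check that against your Spring version. A timeout only frees the thread; it does not explain why the endpoint is slow.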

I'll be happy to try to improve this answer given more input. (Sorry, I would have commented on your question instead, but I don't think I have the rep for that yet.)

SeKa answered Oct 06 '22 21:10