We have a Pub/Sub system based on NServiceBus, and we have intermittent issues with messages getting stuck on the Publisher's outgoing queue indefinitely, rather than being transmitted to the Subscribers' input queues.
Points to note:
Annoyingly our development environment does not exhibit this behaviour, but then again the publisher and subscribers all reside on the same LAN in this environment.
MSMQ messages being stuck in an outgoing queue is purely an MSMQ issue.
Restarting the Publisher and Subscriber services should make no difference, as they are not directly involved in message delivery. If you can fix the problem by ONLY restarting the Pub/Sub services and NOT the Message Queuing service, then it looks like a resource/memory leak problem.
I imagine something like this happening:
Occasional messages get through when just enough kernel memory is temporarily freed up by one of the many services and device drivers that use it.
Item 4 of this blog post is the most likely culprit: http://blogs.msdn.com/b/johnbreakwell/archive/2006/09/18/insufficient-resources-run-away-run-away.aspx
Cheers
John Breakwell
We had a similar scenario in production. It turned out we had migrated one of our subscriber endpoints to a new physical host and forgot to unsubscribe before shutting down the old endpoint. Our publisher was trying to deliver messages to both the old and new endpoints but could only reach the new one. Eventually the publisher's outbound queue grew so large that it started affecting all outgoing messages.
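One way to spot a stale subscriber like this is to watch per-destination outgoing queue depth over time: a destination whose backlog only ever grows is probably unreachable. Here is a rough sketch of that check. It assumes you already collect depth samples somehow (for example from performance counters); the function name, the host names and the sample numbers are all made up for illustration and are not part of NServiceBus or MSMQ.

```python
from typing import Dict, List


def find_stale_destinations(samples: Dict[str, List[int]],
                            min_samples: int = 3) -> List[str]:
    """Return destinations whose outgoing backlog never drains.

    `samples` maps a destination name to queue-depth readings taken at
    successive points in time (hypothetical data, oldest first).
    """
    stale = []
    for destination, depths in samples.items():
        if len(depths) < min_samples:
            continue  # not enough readings to judge
        # The depth never drops between consecutive readings...
        never_drains = all(later >= earlier
                           for earlier, later in zip(depths, depths[1:]))
        # ...and it has actually grown over the observation window.
        grew = depths[-1] > depths[0]
        if never_drains and grew:
            stale.append(destination)
    return sorted(stale)


# Illustrative readings only: the old host keeps piling up messages,
# while the new host drains normally.
readings = {
    "old-host": [10, 25, 40, 80],
    "new-host": [3, 0, 2, 0],
}
print(find_stale_destinations(readings))
```

A destination flagged this way is only a candidate. It still needs checking against the subscription storage to confirm it is a leftover subscription rather than a host that is temporarily down.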
I have run into this issue as well. I know it is not Item 4, as I don't send anything to it before the message gets stuck in the outgoing queue. If I let both publisher and subscriber sit idle for about 10 minutes before sending a message, it never leaves the outgoing queue. If I send a message sooner than that, it flows fine. Also, if I restart the subscriber, the message will then flow. This is reproducible every time I let them sit idle for 10 minutes.
I think I found the answer here, at least this fixed the issue I was having:
http://support.microsoft.com/kb/2554746
Also, in my case it had nothing to do with restarting, so don't let that throw you off. I did see the symptoms in netstat, and messages would initially go through when the client was first started up.
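If you want to check the netstat side yourself, something like the sketch below can pull out the MSMQ connections from `netstat -an` output so you can compare them before and after the idle period. It assumes MSMQ is on its default TCP port 1801 and that the output has the usual Proto / Local Address / Foreign Address / State columns. The function name and the sample text are made up for illustration.

```python
from typing import List, Tuple

MSMQ_PORT = "1801"  # default MSMQ TCP port; adjust if yours differs


def msmq_connections(netstat_output: str) -> List[Tuple[str, str, str]]:
    """Return (local, remote, state) for TCP rows that involve the MSMQ port."""
    rows = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip blank lines, headers and non-TCP rows (UDP rows have no state).
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        # The port is whatever follows the last colon (this also handles
        # IPv6 forms such as [::]:1801).
        local_port = local.rsplit(":", 1)[-1]
        remote_port = remote.rsplit(":", 1)[-1]
        if MSMQ_PORT in (local_port, remote_port):
            rows.append((local, remote, state))
    return rows


# Made-up sample; in practice paste or capture the real `netstat -an` text.
sample = """
  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.5:49731         10.0.0.9:1801          ESTABLISHED
  TCP    10.0.0.5:49800         10.0.0.7:443           TIME_WAIT
  TCP    0.0.0.0:1801           0.0.0.0:0              LISTENING
"""
for local, remote, state in msmq_connections(sample):
    print(local, remote, state)
```

Running it once right after the services start and again after the idle window makes it easy to see whether the connection state on the publisher and the subscriber still agree.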
Just to throw my 2p in:
We had an issue where the Message Queuing service had some kind of memory leak and would consume large amounts of memory, which it did not release.
This led to messages getting stuck for long periods of time, although they would eventually be delivered (sometimes after 3 days).
We have not bothered fixing this yet as it only happens when the service is under heavy load which does not happen often.