I have an application which fetches messages from a ZeroMQ publisher, using a PUB/SUB setup. The reader is slow sometimes so I set a HWM on both the sender and receiver. I expect that the receiver will fill the buffer and jump to catch up when it recovers from processing slowdowns. But the behavior that I observe is that it never drops! ZeroMQ seems to be ignoring the HWM. Am I doing something wrong?
Here's a minimal example:
publisher.py
import zmq
import time
ctx = zmq.Context()
sock = ctx.socket(zmq.PUB)
sock.setsockopt(zmq.SNDHWM, 1)
sock.bind("tcp://*:5556")
i = 0
while True:
sock.send(str(i))
print i
time.sleep(0.1)
i += 1
subscriber.py
import zmq
import time
ctx = zmq.Context()
sock = ctx.socket(zmq.SUB)
sock.setsockopt(zmq.SUBSCRIBE, "")
sock.setsockopt(zmq.RCVHWM, 1)
sock.connect("tcp://localhost:5556")
while True:
print sock.recv()
time.sleep(0.5)
I believe there are a couple things at play here:
1
.PUB
HWM
will never drop messages... due to the way PUB
sockets work, it will always immediately processes the message whether there is an available subscriber or not. So unless it actually takes ZMQ .1 seconds to process the message through the queue, your HWM
will never come into play on the PUB
side.What should be happening is something like the following (I'm assuming an order of operations that would allow you to actually receive the first published message):
PUB
processes and sends the first message, SUB
receives and processes the first messagePUB
sleeps for .1 seconds and processes & sends the second messageSUB
sleeps for .5 seconds, the socket receives the second message but sits in queue until the next call to sock.recv()
processes itPUB
sleeps for .1 seconds and processes & sends the third messageSUB
is still sleeping for another .3 seconds, so the third message should hit the queue behind the second message, which would make 2 messages in the queue, and the third one should drop due to the HWM
... etc etc etc.
I suggest the following changes to help troubleshoot the issue:
HWM
on your publisher... it does nothing but add a variable we don't need to deal with in your test case, since we never expect it to change anything. If you need it for your production environment, add it back in and test it in a high volume scenario later.HWM
on your subscriber to 50. It'll make the test take longer, but you won't be at the extreme edge case, and since the ZMQ documentation states that the HWM isn't exact, the extreme edge cases could cause unexpected behavior. Mind you, I believe your test (being small numbers) wouldn't do that, but I haven't looked at the code implementing the queues so I can't say with certainty, and it may be possible that your data is small enough that your effective HWM
is actually larger.This of course doesn't address what happens if you still don't skip any messages... If you do that and still experience the same issue, then we may need to dig more into how high water marks actually work, there may be something we're missing.
I met exactly the same problem, and my demo is nearly the same with yours, the subscriber or publisher won't drop any message after either zmq.RCVHWM or zmq.SNDHWM is set to 1.
I walk around after referring to the suicidal snail pattern for slow subscriber detection in Chap.5 of zguide. Hope it helps.
BTW: would you please let me know if you've solved the bug of zmq.HWM ?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With