
ZeroMQ: Publish to multiple Workers and wait for ACK

Tags:

zeromq

I'm working on an app that notifies multiple Workers when a reboot is about to happen, then waits for ALL the Workers to perform some tasks and send an ACK before rebooting. The number of Workers can change, so my app needs to know how many Workers are currently subscribed in order to tell when every Worker has sent an ACK.

Is a pub/sub approach the best way for doing this? Does it provide a way to figure out how many subscribers are currently connected? Should my app use a REP socket to listen for ACK from the Workers? Is there a more elegant way of designing this?

Thanks

philipdotdev asked Dec 04 '13 17:12


1 Answer

Is a pub/sub approach the best way for doing this?

Using pub/sub from the server to broadcast a "server reboot" message is fine for the workers who get the message, but it's not foolproof. Slow-joiner syndrome (a subscriber missing messages published before its subscription has taken effect) may prevent one or more workers from receiving the message. To address that, the server, once it publishes a reboot message, should keep publishing that message until all workers respond with an ACK. That creates a new problem: how does the server keep track of all workers to ensure it receives every necessary ACK?

Does it provide a way to figure out how many subscribers are currently connected?

No. Exposing that information breaks ZeroMQ's abstraction model, which hides the physical details of the connection and connected peers. You can send heartbeat messages periodically from server to workers over pub/sub; workers respond with a logical node id (WorkerNode1, etc.) on a separate socket, since PUB only sends, and the server keeps track of each worker in a hashtable along with a future expiration time. When a worker responds to a heartbeat, the server simply resets the future expiration for that worker; the server should periodically check the hashtable and remove expired workers.

That's the best you can do for keeping track of workers. The shorter the expiration, the more accurately the list reflects which workers are actually alive, at the cost of more frequent heartbeats.
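The hashtable-with-expiration idea could be sketched roughly like this. It is a minimal illustration using only the Python standard library; the class name `WorkerRegistry`, its methods, and the `ttl_seconds` parameter are made up for the example and are not part of ZeroMQ.

```python
import time


class WorkerRegistry:
    """Tracks workers by logical id, each with a future expiration time.

    Hypothetical helper for illustration; not a ZeroMQ API.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._expires_at = {}        # worker id -> expiration timestamp

    def heartbeat(self, worker_id):
        """Call when a worker answers a heartbeat: reset its expiration."""
        self._expires_at[worker_id] = self._clock() + self._ttl

    def prune(self):
        """Remove workers whose expiration has passed; return their ids."""
        now = self._clock()
        expired = [w for w, t in self._expires_at.items() if t <= now]
        for worker_id in expired:
            del self._expires_at[worker_id]
        return expired

    def live_workers(self):
        """Return the set of workers currently considered alive."""
        self.prune()
        return set(self._expires_at)
```

When a reboot is requested, the server would take `live_workers()` as the set of ACKs it has to collect. A worker that dies between its last heartbeat and the reboot still counts as alive until it expires, so the ACK wait probably wants a timeout as well.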

Should my app use a REP socket to listen for ACK from the Workers? Is there a more elegant way of designing this?

REQ/REP sockets have limited uses because they force a strict send/receive lockstep. I'd use PUB on the server for sending reboot and heartbeat messages, and ROUTER to receive ACKs. The workers should use DEALER for sending ACKs (and anything else), and SUB for receiving heartbeats/reboots. ROUTER and DEALER are bi-directional, fully asynchronous, and the most versatile socket types; you can't go wrong with them.
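A rough sketch of that socket layout, assuming the pyzmq bindings and running server and workers as threads in one process over `inproc` so it is self-contained. The endpoint names, the `REBOOT`/`ACK` payloads, and the function names are assumptions for the example. In a real deployment the workers would be separate processes connecting over `tcp://`, and the expected set would come from heartbeat tracking rather than a fixed tuple.

```python
import threading
import time

import zmq

# Hypothetical endpoints; inproc keeps the demo inside one process.
PUB_ADDR = "inproc://reboot-pub"
ACK_ADDR = "inproc://reboot-ack"


def worker(ctx, worker_id, timeout_ms=5000):
    """Worker side: SUB receives broadcasts, DEALER sends the ACK."""
    sub = ctx.socket(zmq.SUB)
    sub.setsockopt(zmq.SUBSCRIBE, b"")          # subscribe to everything
    sub.connect(PUB_ADDR)
    dealer = ctx.socket(zmq.DEALER)
    dealer.setsockopt(zmq.IDENTITY, worker_id)  # logical node id, set before connect
    dealer.connect(ACK_ADDR)
    try:
        if sub.poll(timeout_ms) and sub.recv() == b"REBOOT":
            # ... perform pre-reboot tasks here ...
            dealer.send(b"ACK")
    finally:
        sub.close(linger=0)
        dealer.close(linger=1000)               # give the ACK time to flush


def wait_for_acks(pub, router, expected, timeout_s=5.0):
    """Republish REBOOT until every expected worker has ACKed or time runs out.

    Returns the set of worker ids that never answered.
    """
    pending = set(expected)
    deadline = time.monotonic() + timeout_s
    while pending and time.monotonic() < deadline:
        pub.send(b"REBOOT")                     # repeated to ride out slow joiners
        while router.poll(100):                 # wait up to 100 ms per check
            identity, payload = router.recv_multipart()
            if payload == b"ACK":
                pending.discard(identity)       # duplicate ACKs are harmless
    return pending


def run_demo(worker_ids=(b"WorkerNode1", b"WorkerNode2")):
    ctx = zmq.Context()
    pub = ctx.socket(zmq.PUB)
    pub.bind(PUB_ADDR)
    router = ctx.socket(zmq.ROUTER)
    router.bind(ACK_ADDR)
    threads = [threading.Thread(target=worker, args=(ctx, wid))
               for wid in worker_ids]
    for t in threads:
        t.start()
    missing = wait_for_acks(pub, router, worker_ids)
    for t in threads:
        t.join()
    pub.close(linger=0)
    router.close(linger=0)
    ctx.term()
    return missing


if __name__ == "__main__":
    print("workers that never ACKed:", run_demo())
```

ROUTER prepends the sender's identity frame to each message, which is what lets the server match an ACK to a worker id without a separate lookup.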

Hope it helps!

raffian answered Oct 21 '22 09:10