I am using RabbitMQ to have worker processes encode video files. I would like to know when all of the files are complete - that is, when all of the worker processes have finished.
The only way I can think to do this is by using a database. When a video finishes encoding:
UPDATE videos SET status = 'complete' WHERE filename = 'foo.wmv'
-- etc etc etc as each worker finishes --
And then to check whether or not all of the videos have been encoded:
SELECT count(*) FROM videos WHERE status != 'complete'
But if I'm going to do this, then I feel like I am losing the benefit of RabbitMQ as a mechanism for multiple distributed worker processes, since I still have to manually maintain a database queue.
Is there a standard mechanism for RabbitMQ dependencies? That is, a way to say "wait for these 5 tasks to finish, and once they are done, then kick off a new task?"
I don't want to have a parent process add these tasks to a queue and then "wait" for each of them to return a "completed" status. Then I have to maintain a separate process for each group of videos, at which point I've lost the advantage of decoupled worker processes as compared to a single ThreadPool concept.
Am I asking for something which is impossible? Or, are there standard widely-adopted solutions to manage the overall state of tasks in a queue that I have missed?
Edit: after searching, I found this similar question: Getting result of a long running task with RabbitMQ
Are there any particular thoughts that people have about this?
In php-amqlib: $channel->basic_get(QUEUE_NAME, true); // the second arg is no_ack . The second argument marks that no acknowledgment is expected for that message. That is, you don't have to "flag" the message as read for RabbitMQ to confidently dequeue it.
In case RabbitMQ receive non-routable message it drop it. So while message was received, it was not queued. You may configure Alternate Exchanges to catch such messages. Save this answer.
The passive argument to queue_declare allows you to check whether a queue exists without modifying the server state, so you can use queue_declare with the passive option to check the queue length. This should be the accepted answer, even if he did miss Basic. Get as the 2nd source of this info.
Use a "response" queue. I don't know any specifics about RabbitMQ, so this is general:
numSent == numResponded
, you're doneSomething to keep in mind is a timeout -- What happens if a child process dies? You have to do slightly more work, but basically:
This is called the Request Reply Pattern.
Based on Brendan's extremely helpful answer, which should be accepted, I knocked up this quick diagram which be helpful to some.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With