
What's the best approach to scheduling tasks across a Docker cluster?

Currently, my app runs on a single server, where a crontab runs tasks at certain times according to specified rules.

Now I'm thinking about migrating the app into a Docker container so that I can run multiple independent instances of it. What I'm wondering is how to schedule tasks across multiple Docker containers.

Let's say I have a PHP command that fetches new data from a third-party app via its API every hour. Currently, I would schedule it with cron like this: 0 */1 * * * php /some/path/index.php mycommand. There can be multiple similar commands launched at different frequencies.

I cannot simply pack the crontab into my Docker image, because the command would then be launched 5 times when there are 5 containers running. I want it to launch only once, regardless of how many containers are running.

What would be the ideal solution to achieve this?

simPod asked Apr 07 '17 14:04


2 Answers

You could use a locking mechanism built on something like Redis. Basically, it would work like this.

The script wakes up, and the first thing it does is try to get the lock. If it gets the lock, it moves forward; if something else holds the lock, it exits. Otherwise it does its work and then releases the lock.

Since only one script can hold the lock at a time, the task runs only once, no matter how many containers try to start it.

It is important to release the lock when the script is done, and also to set a TTL on the lock, so that if the script dies before releasing it, the lock frees up automatically once the TTL expires.

Here are some docs on how to use Redis as a distributed lock: https://redis.io/topics/distlock
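
The acquire / run / release flow described above can be sketched roughly as follows. This is only an illustration, not code from the answer: it is written in Python rather than PHP, and InMemoryLockStore is a hypothetical stand-in for Redis so that the example runs without a server. The names run_exclusively, set_if_absent and delete_if_matches are made up for the sketch.

```python
import time
import uuid


class InMemoryLockStore:
    """Hypothetical stand-in for Redis so the sketch runs without a server.

    It mimics two operations: "set if absent, with expiry" and
    "delete only if the stored value still matches".
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (token, expiry time)

    def set_if_absent(self, key, token, ttl_seconds):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return False  # someone else holds an unexpired lock
        self._entries[key] = (token, now + ttl_seconds)
        return True

    def delete_if_matches(self, key, token):
        entry = self._entries.get(key)
        if entry is not None and entry[0] == token:
            del self._entries[key]
            return True
        return False  # lock is gone or now belongs to someone else


def run_exclusively(store, lock_key, ttl_seconds, job):
    """Run job() only if this caller wins the lock; return True if it ran."""
    token = uuid.uuid4().hex  # unique per attempt, so we only release our own lock
    if not store.set_if_absent(lock_key, token, ttl_seconds):
        return False  # another container got there first: exit quietly
    try:
        job()
    finally:
        # Release even if the job raised; the TTL covers a hard crash.
        store.delete_if_matches(lock_key, token)
    return True
```

Against a real Redis server, the acquire step would typically be a single SET with the NX and EX options (in redis-py, something like r.set(lock_key, token, nx=True, ex=ttl_seconds)), and the release would be a compare-and-delete as described in the linked docs. One caveat worth checking for your setup: if the job finishes and releases the lock before a slower container's cron fires, that container could still acquire the lock and run the job again.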

Ken Cochrane answered Sep 20 '22 06:09


A simple strategy in your case would be to use containers with separate roles. For example, instead of using 5 containers responding to HTTP requests AND running cron, you could have 4 containers running your application only and one exclusively for cron jobs.
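
One way to get separate roles out of a single image is to let the container's entrypoint pick what to run from an environment variable. The sketch below is an assumption-laden illustration, not something from the answer: the APP_ROLE variable, the function names, and both commands (php-fpm for the web role, Debian-style cron -f for the cron role) are placeholders to adapt to the actual image.

```python
import os

# Hypothetical role -> command mapping; adjust to what the image really runs.
COMMANDS = {
    "web": ["php-fpm", "--nodaemonize"],  # serve the application only
    "cron": ["cron", "-f"],               # run the crontab in the foreground
}


def command_for_role(role):
    """Return the argv list for the given role, or raise ValueError."""
    try:
        return COMMANDS[role]
    except KeyError:
        raise ValueError(
            f"unknown APP_ROLE {role!r}; expected one of {sorted(COMMANDS)}"
        )


def main():
    # APP_ROLE is an assumed variable name; default to the web role.
    role = os.environ.get("APP_ROLE", "web")
    argv = command_for_role(role)
    # Replace this process so the chosen program receives signals directly.
    os.execvp(argv[0], argv)

# An entrypoint script would end by calling main().
```

With this in place, the 4 application containers would be started with APP_ROLE=web (or with the variable unset) and the single cron container with APP_ROLE=cron, so the crontab exists in the image but is only active in one container.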

If you ever need to scale up your cron jobs by adding more nodes, you'll need a distributed queue/lock solution, as @Ken Cochrane described.

Andre Bernardes answered Sep 21 '22 06:09