Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Scheduled tasks in cluster using zookeeper

We use Spring to run scheduled tasks which works fine with single node. We want to run these scheduled tasks in cluster of N nodes such that the tasks are executed atmost by one node at a point of time. This is for enterprise use case and we may expect upto 10 to 20 nodes.

I looked into various options:

  1. Use Quartz which seems to be a popular choice for running scheduled tasks in a cluster. Drawback: Database dependency which I want to avoid.
  2. Use zookeeper and always run the scheduled tasks only on the leader/master node. Drawback: Task execution load is not distributed
  3. Use zookeeper and have the scheduled tasks invoke on all nodes. But before the task runs acquire distributed lock and release once execution is complete. Drawback: The system clock on all nodes should be in sync which may be an issue if application is overloaded causing system clock drift.
  4. Use zookeeper and let the master node keep producing the task as per the schedule and assign it to a random worker. A new task is not assigned if previous scheduled task has not been worked on yet. Drawback: This appears to add too much complexity.

I am inclining towards using #3 which appears to be a safe solution assuming the zookeeper ensemble nodes run on a separate cluster with system clock in sync using NTP. This is also on assumption that if system clocks are in sync, then all nodes have equal chance of acquiring the lock to execute a task.
EDIT: After some more thought I realize this may not be a safe solution either since the system clock should be in sync between the nodes where the scheduled tasks are running not just the zookeeper cluster nodes. I am saying not safe because the nodes where the tasks are running can be overloaded with GC pauses and other reasons and there is possibility of clocks going out of sync. But again I would think this is a standard problem with distributed systems.

Could you please advise if my understanding on each of the options is accurate? Or may be there is a better approach than the listed options to solved this problem.

like image 939
vkorimilli Avatar asked Dec 15 '16 06:12

vkorimilli


1 Answers

Well, you can improve the #3 like this.

Zookeeper provide watchers. That is, you can set a watcher on a given ZNode (say at path /some/path). All your nodes in the cluster are watching the same Znode. Whenever a node thinks(as scheduled or whatever way) it should now run the scheduled task,

  1. First it create a PERSISTENT_SEQUENTIAL child node under /some/path (which all the nodes are watching). Also, you can set the data of that node as you wish. It may be a json string specifying the details about the task to be run. The new ZNode path will look like /some/path/prefix_<sequence-number>.
  2. Then, all the nodes in the cluster will be notified about the child node created. All of them then fetch the newly created ZNode's data and decode the task.
  3. Now, each node try to acquire a distributed lock. Whoever acquiring it first can execute it. Once executed, that node should report (Say by creating a new ZNode under /some/path/prefix_<sequence-number> with name success), that that task was executed. Then release the lock.
  4. Whenever a node is trying to execute a task, before trying to acquire the distributed lock, it should check if that ZNode already has a success child node.

This design ensures that no task is run twice by checking the child node with name success under a given ZNode created to notify to start a task.

I have used the above design for an enterprise solution. Actually for a distributed command framework ;-)

like image 100
Imesha Sudasingha Avatar answered Sep 22 '22 06:09

Imesha Sudasingha