
How to design task distribution with ZooKeeper

I am planning to write an application that will have distributed Worker processes. One of them will be the Leader, which will assign tasks to the other processes. Designing the Leader election process is quite simple: each process tries to create an ephemeral node at the same path. Whoever succeeds becomes the Leader.

Now, my question is how to design the process of distributing the tasks evenly? Any recipe for this?

I'll elaborate a little on the environment setup:

Suppose there are 10 worker machines, each running one process, and one of them becomes the Leader. Tasks are submitted to a queue; the Leader takes them and assigns each one to a worker. The worker processes get notified whenever a task is submitted.

Sabya asked Mar 07 '11 09:03



1 Answer

I am not sure I understand your algorithm for Leader election, but the recommended approach is to use sequential ephemeral nodes and follow the recipe at http://zookeeper.apache.org/doc/r3.3.3/recipes.html#sc_leaderElection, which also explains how to avoid the "herd" effect.
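As a rough sketch of the decision each process makes in that recipe: after creating its own sequential ephemeral znode, a process lists the children of the election node, treats the lowest sequence number as the Leader, and otherwise watches only the znode just ahead of its own, which is what avoids the herd effect. The code below models only that decision in plain Python; it does not talk to ZooKeeper, and the function names and sample znode names are hypothetical. It assumes the 10-digit zero-padded counter that ZooKeeper appends to sequential znode names.

```python
from typing import List, Optional, Tuple


def sequence_of(znode_name: str) -> int:
    """Return the counter ZooKeeper appends to a sequential znode name.

    Assumes the standard 10-digit zero-padded suffix.
    """
    return int(znode_name[-10:])


def election_role(children: List[str], my_node: str) -> Tuple[bool, Optional[str]]:
    """Decide this process's role from the children of the election znode.

    Returns (is_leader, node_to_watch). The Leader watches nothing; every
    other process watches only its immediate predecessor, so a single
    znode deletion wakes a single process instead of all of them.
    """
    ordered = sorted(children, key=sequence_of)
    position = ordered.index(my_node)
    if position == 0:
        return True, None
    return False, ordered[position - 1]


# Hypothetical children of an election znode, in arbitrary order.
children = ["a-n_0000000003", "b-n_0000000001", "c-n_0000000002"]
leader_role = election_role(children, "b-n_0000000001")
follower_role = election_role(children, "a-n_0000000003")
```

When the watched predecessor disappears, the process would list the children again and rerun the same check, since the predecessor may have been a failed follower rather than the Leader.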

Distribution of tasks can be done with a simple distributed queue and does not strictly need a Leader. The producer enqueues tasks, and consumers keep a watch on the tasks node; when the watch fires, a consumer takes a task and deletes the associated znode. There are edge cases to consider, such as requeuing tasks from failed consumers. See http://zookeeper.apache.org/doc/r3.3.3/recipes.html#sc_recipes_Queues
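The consumer side of such a queue can be sketched as follows. This is a minimal model in plain Python, not a ZooKeeper client: `list_children` and `try_delete` are hypothetical callables standing in for listing the children of the tasks node and deleting one of them, with `try_delete` returning False when another consumer has already removed that znode. It also assumes tasks are sequential znodes with the usual 10-digit suffix.

```python
from typing import Callable, List, Optional


def claim_next_task(
    list_children: Callable[[], List[str]],
    try_delete: Callable[[str], bool],
) -> Optional[str]:
    """Try to claim the oldest pending task.

    Walks the task znodes in sequence order and returns the first one this
    consumer manages to delete. A failed delete means another consumer won
    the race for that task, so the next candidate is tried. Returns None
    if every task was taken or the queue is empty.
    """
    candidates = sorted(list_children(), key=lambda name: int(name[-10:]))
    for name in candidates:
        if try_delete(name):
            return name
    return None
```

A consumer would call this each time its watch on the tasks node fires, then set the watch again. Note that this sketch deletes the znode before the work is done, so a consumer that crashes mid-task loses it; handling that is the requeuing edge case mentioned above and is not covered here.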

manku answered Oct 05 '22 23:10