
How to synchronize state between distributed clients

Here is my problem:

My application is a distributed real-time message broker for web applications. Browser clients connect to one of the application nodes, and the nodes are connected to each other via ZeroMQ's PUB/SUB mechanism. When a client sends a message, its node publishes it to a PUB socket; the other nodes receive the message on their SUB sockets and forward it to their own connected clients.

But now I need presence and history functionality. Presence means providing a list describing all clients connected to all nodes; history means providing a list of the last several messages sent. In other words, I need the entire state of the application. I am considering several ways to achieve this:

1) Send all information about connected clients to a central server. When a client asks for presence, query the central server and return its response to the client.

2) Keep all information on every node. When a client connects to any node, broadcast information about it to the other nodes using a PUBLISH operation. Then, when a client asks for presence, I can return a response immediately.

3) Gather information on demand from all nodes. I can't really imagine how to program this at the moment, but it avoids duplicating information, which reduces memory consumption. In this case I don't need to worry about fitting all the information in memory on a single node.

4) Use some distributed data store, something like Doozerd. But I don't like this idea because of the extra dependency.
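Option 2's bookkeeping is simple enough to sketch. A minimal illustration, assuming each node broadcasts connect/disconnect events over the existing PUB/SUB channel and every node applies them to a local registry (class and field names are hypothetical, and the ZeroMQ plumbing is omitted):

```python
# Sketch of option 2: every node keeps a full presence registry and
# applies connect/disconnect events received from the other nodes.

class PresenceRegistry:
    def __init__(self):
        self.clients = {}  # client_id -> client description

    def apply(self, event):
        """Apply a presence event published by any node."""
        if event["type"] == "connect":
            self.clients[event["client_id"]] = event["info"]
        elif event["type"] == "disconnect":
            self.clients.pop(event["client_id"], None)

    def presence(self):
        """Answer a client's presence query immediately, from memory."""
        return list(self.clients.values())

registry = PresenceRegistry()
registry.apply({"type": "connect", "client_id": "c1", "info": {"user": "alice"}})
registry.apply({"type": "connect", "client_id": "c2", "info": {"user": "bob"}})
registry.apply({"type": "disconnect", "client_id": "c1"})
print(registry.presence())  # [{'user': 'bob'}]
```

The trade-off is exactly the one described above: every node holds the full state, so presence queries are instant, but memory use grows with the total number of clients.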

A client needs presence information every time it connects to a node; presence information changes on every client connect/disconnect, and history information changes on every message.

This is an open-source application, so I don't know how many connected clients it must support. Load tests will eventually reveal that number.

There is no strong reliability requirement for the presence and history data.

I would really appreciate advice on which of these options is the right way to solve my problem, or whether there is another, better way.

Alex Emelin asked Sep 18 '25 02:09


2 Answers

The presence and history data is quite naturally divided along the lines of the channel it pertains to.

So have you considered distributing the channels across the application servers? Each application node would own a few channels it knows about, and queries about other channels get forwarded to the specific node that can answer them.

This probably comes closest to option 3 in your list.

This way each channel's presence data becomes a manageable chunk, probably small enough to keep in memory. The history data could also be cached in memory on the channel's server, with some kind of eviction algorithm deciding which history data is no longer interesting enough to cache; evicted data gets removed from memory and sits ready to be retrieved from storage.
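One simple eviction policy for such a cache is least-recently-used. A small sketch backed by `OrderedDict` (purely illustrative; the persistent-storage fallback is left out):

```python
from collections import OrderedDict

class HistoryCache:
    """LRU cache for per-channel history. Evicted channels would be
    re-read from persistent storage on demand (storage layer omitted)."""

    def __init__(self, max_channels):
        self.max_channels = max_channels
        self.cache = OrderedDict()  # channel -> list of messages

    def get(self, channel):
        if channel in self.cache:
            self.cache.move_to_end(channel)   # mark as recently used
            return self.cache[channel]
        return None  # caller falls back to persistent storage

    def put(self, channel, history):
        self.cache[channel] = history
        self.cache.move_to_end(channel)
        if len(self.cache) > self.max_channels:
            self.cache.popitem(last=False)    # evict least recently used

cache = HistoryCache(max_channels=2)
cache.put("news", ["m1"])
cache.put("chat", ["m2"])
cache.get("news")            # touch "news", so "chat" becomes oldest
cache.put("sport", ["m3"])   # evicts "chat"
print(cache.get("chat"))     # None
```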


Another idea for your consideration: do you know 0MQ's Clustered Hashmap Protocol (CHP)? I think you could use it (or be inspired by it) to push presence and history data about the channels a client is connected to, instead of having clients pull it from the application servers.


EDIT: I read up on the CHP protocol; it had been a while since I read the guide.

The CHP server publishes all changes to the hashmap data to all subscribing clients, and the subscribers filter the data. This is how subscribing to 0MQ topics works in general, not just in CHP. But it may turn out to be too much data for your clients to chew through if a server hosts many channels while clients are generally interested in only a few of them.

You are facing this problem already, I think, so I wonder: How do you organize this now?

The snapshot is retrieved by the clients when they join, and it is filtered based on the subtree. There are interesting details in the user guide on how to keep the published updates queued until the snapshot arrives and how to toss away the updates that predate the snapshot.

So we will do the synchronization in the client, as follows:

  • The client first subscribes to updates and then makes a state request. This guarantees that the state is going to be newer than the oldest update it has.

  • The client waits for the server to reply with state, and meanwhile queues all updates. It does this simply by not reading them: ØMQ keeps them queued on the socket queue.

  • When the client receives its state snapshot, it begins once again to read updates. However, it discards any updates that are older than the snapshot: if the snapshot includes updates up to sequence 200, the client discards updates numbered 200 and below and starts applying from 201.

  • The client then applies updates to its own state snapshot.
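The sequencing logic of those steps can be sketched without any ØMQ plumbing, assuming every update carries a sequence number and the snapshot reports the highest sequence it includes (a toy model to show the filtering, not the actual CHP code):

```python
# Toy model of the client-side sync described above: updates queue up
# while the snapshot request is outstanding, then stale ones are dropped.

def synchronize(snapshot, queued_updates):
    """snapshot: (state_dict, last_seq) as returned by the server.
    queued_updates: list of (seq, key, value) tuples that arrived
    while the client was waiting for the snapshot."""
    state, last_seq = snapshot
    state = dict(state)
    for seq, key, value in queued_updates:
        if seq <= last_seq:
            continue            # already reflected in the snapshot
        state[key] = value      # newer than the snapshot: apply it
    return state

snapshot = ({"a": 1, "b": 2}, 200)             # includes updates up to 200
updates = [(199, "a", 0), (201, "b", 3), (202, "c", 4)]
print(synchronize(snapshot, updates))  # {'a': 1, 'b': 3, 'c': 4}
```

Because the client subscribes before requesting the snapshot, no sequence number can fall into a gap between the snapshot and the first queued update.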

I think this bit will definitely be of interest to you.

flup answered Sep 19 '25 16:09


Basically, all of your options are valid in certain circumstances.

Without specific requirements I would go with the simplest solution.

I think the simplest solution is to use something like Redis. It is stable, used by many companies (including Stack Overflow, to my knowledge), very fast, and quite flexible, and it makes capped lists for history easy to implement. It will also be easy to iterate on your requirements, because you can change functionality quickly.
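A capped list in Redis is the classic `LPUSH` + `LTRIM` idiom. The equivalent bookkeeping is shown here in plain Python for illustration, with the actual Redis commands in the comments (so the example runs without a Redis server):

```python
class CappedHistory:
    """Mirrors the Redis capped-list idiom:
        LPUSH channel msg        (prepend newest message)
        LTRIM channel 0 N-1      (keep only the N newest)"""

    def __init__(self, max_len):
        self.max_len = max_len
        self.items = []

    def push(self, message):
        self.items.insert(0, message)    # LPUSH: newest message first
        del self.items[self.max_len:]    # LTRIM 0 max_len-1

history = CappedHistory(max_len=3)
for msg in ["m1", "m2", "m3", "m4"]:
    history.push(msg)
print(history.items)  # ['m4', 'm3', 'm2'] -- oldest message dropped
```

With real Redis the two commands would run per published message, one capped list per channel, which gives you the "last several messages" history essentially for free.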

Another option, if you don't want an extra dependency/deployment, is to partition the information between your servers (using hash partitioning or consistent hashing) so you know where to store and retrieve information about a particular client or other entity.
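A minimal sketch of the hash-partitioning idea (simple modulo partitioning; a consistent-hash ring would add virtual nodes to reduce reshuffling when the node set changes — the function name is hypothetical):

```python
import hashlib

def owner_node(key, nodes):
    """Pick the node responsible for `key` by hashing it. Every node
    computes the same owner, so a presence or history query about a
    client or channel can be routed to the node that holds its state."""
    digest = hashlib.sha1(key.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]

nodes = ["node-a", "node-b", "node-c"]
print(owner_node("client-42", nodes))   # deterministic across all nodes
```

The drawback of plain modulo partitioning is that adding or removing a node remaps almost every key, which is exactly what consistent hashing mitigates.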

HTH

Sergey Zyuzin answered Sep 19 '25 17:09