
Realtime Socket.IO scaling problem - python

I'm trying to do something like the stream on Facebook, with socket.io 0.6 and tornadio.

Each user has his own comet channel/group on his wall. When I post, I send a comet message to the walls of all my friends (even if they aren't online).

The problem is scaling: what if I have 1 million friends? It would take a long time to write to all of their walls.

Is there a more efficient way to do this using comet?

Tiago Moutinho asked Nov 20 '25 00:11



1 Answer

This is a difficult problem in the social space. There is a trade-off between two approaches:

  • push: When a user produces an event (e.g. a status update), you push that status update out to the stream of each of the user's friends. When a user loads his or her stream, you only have to read a record from a single place.
  • pull: When a user produces an event, you write that event to the user's own data record. When a user loads his or her stream, you poll the data record of each of his or her friends, aggregating the results on the fly.
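To make the difference concrete, here is a minimal in-memory sketch of both models. The `friends` mapping, the dictionaries and the function names are all made up for illustration; a real system would use a database or a message queue instead of Python dicts.

```python
from collections import defaultdict

# Hypothetical friend graph: user -> list of friends.
friends = {"alice": ["bob", "carol"], "bob": ["alice"], "carol": ["alice"]}

# --- Push model: fan out on write, cheap read ---
push_streams = defaultdict(list)  # user -> events delivered to that user's stream

def push_publish(user, event):
    # One write per friend: the cost grows with the size of the friend list.
    for friend in friends[user]:
        push_streams[friend].append((user, event))

def push_load(user):
    # A single read from one place.
    return list(push_streams[user])

# --- Pull model: cheap write, fan in on read ---
own_events = defaultdict(list)  # user -> events that user produced

def pull_publish(user, event):
    # A single write, no matter how many friends the user has.
    own_events[user].append((user, event))

def pull_load(user):
    # One read per friend, aggregated on the fly.
    stream = []
    for friend in friends[user]:
        stream.extend(own_events[friend])
    return stream
```

With 1 million friends, `push_publish` does 1 million writes per status update, while `pull_publish` does one write and moves the work to whoever loads a stream.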

The push method is good when loading a stream happens much more often than user updates and when the "fanout" of users (e.g. the maximum number of followers a user has) is low. The pull method is good when users rarely load their streams, or when the number of users a user can follow is low.

I co-authored a paper on how to do this efficiently. Basically, we used a hybrid method, determining when to push or pull based on user statistics.
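As a rough illustration of what "based on user statistics" could mean, here is one plausible per-pair rule. This is an assumption for the sake of example, not necessarily the rule from the paper: push to a friend when he or she loads the stream at least as often as the producer posts, otherwise let that friend pull. The function name and both rate parameters are hypothetical.

```python
def choose_strategy(producer_updates_per_day, consumer_loads_per_day):
    """Pick 'push' or 'pull' for one (producer, consumer) pair.

    Assumed cost model: pushing costs one write per update by the producer,
    pulling costs one read of the producer's record per stream load by the
    consumer. We pick whichever does less work.
    """
    if producer_updates_per_day <= consumer_loads_per_day:
        return "push"
    return "pull"
```

Under this rule, a user who posts twice a day to a friend who checks the stream 50 times a day gets `"push"`, while a very chatty user followed by someone who rarely logs in gets `"pull"`.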

For simplicity, I would recommend you implement the pull model. Cache the result of the aggregation, and refresh a user's feed only once its cache entry is older than a chosen time limit.
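A minimal sketch of that cached pull, assuming events are stored as `(timestamp, author, text)` tuples. The 30-second limit, the `_feed_cache` dict and the `get_feed` signature are placeholders chosen for the example; in production the cache would more likely live in something like memcached or Redis.

```python
import time

CACHE_TTL_SECONDS = 30.0  # assumed staleness limit; tune for your application
_feed_cache = {}          # user -> (time the feed was built, feed)

def get_feed(user, friends, own_events, now=None):
    """Return the user's feed, re-aggregating only when the cache is stale.

    friends:    dict mapping user -> list of friends
    own_events: dict mapping user -> list of (timestamp, author, text)
    now:        optional clock value in seconds, injectable for testing
    """
    now = time.monotonic() if now is None else now
    cached = _feed_cache.get(user)
    if cached is not None and now - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]  # still fresh: skip the per-friend reads

    # Pull: read each friend's own record and merge the results.
    feed = []
    for friend in friends.get(user, []):
        feed.extend(own_events.get(friend, []))
    feed.sort(key=lambda event: event[0], reverse=True)  # newest first

    _feed_cache[user] = (now, feed)
    return feed
```

The trade-off is that a new post can take up to `CACHE_TTL_SECONDS` to show up in a friend's feed, in exchange for doing the expensive aggregation at most once per period per user.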

jterrace answered Nov 22 '25 14:11
