
How to design a distributed node.js web server

Suppose I need to implement a web application that will have a high volume of concurrent users. I decide to use node.js because it scales very well, has good performance, an open source community, etc. Then, to avoid the bottleneck of having gazillions of users on the same event loop, I decide to use a cluster of processes to take advantage of the multi-core CPU. Furthermore, I have 3 machines (main + 2) because I need to manipulate big data with Cassandra. Awesome, this means I have 3*n node.js processes, where n is the number of CPU cores (the machines are identical).

OK, so I did some research and ended up with the following design:

  • Nginx listens on port 80 and is used only to serve static content (img, css, js, etc).
    It forwards the dynamic traffic to haproxy. I know how to configure nginx but I still have to look into haproxy, so let's say that haproxy is listening on port 4000. Nginx and haproxy are installed on the main machine (the entry point).
  • Haproxy load balances between the 3 machines. It forwards traffic to port 4001, that is, the node.js processes are listening on 4001.
  • Every machine runs a node.js cluster of n processes listening on 4001.
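
The last bullet, one cluster of n processes per machine all listening on port 4001, might look roughly like the sketch below. It uses only Node's built-in `cluster`, `http`, and `os` modules. The function name `startCluster` and the response body are illustrative, and `cluster.isPrimary` assumes Node 16 or later (older versions call it `cluster.isMaster`).

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: fork one worker per core, each listening on `port`.
function startCluster(port = 4001, workerCount = os.cpus().length) {
  if (cluster.isPrimary) {
    for (let i = 0; i < workerCount; i++) cluster.fork();
    // Replace a worker that dies so the number of processes stays constant.
    cluster.on('exit', () => cluster.fork());
    return null;
  }
  // Workers can all listen on the same port; the cluster module shares it.
  const server = http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  });
  return server.listen(port);
}

// Usage: call startCluster() at the bottom of the entry script.
```

Each request is still handled by exactly one worker, so anything kept in a worker's memory is invisible to the other workers on the same machine as well as to the other machines.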

If I'm correct, a single HTTP request will be forwarded to a single node.js process.

Creating a session is quite normal, right? A session is just a map, this map is an Object, and this Object lives in a node.js process. Haproxy will be configured with a round-robin scheduler, so the same user can be forwarded to different node.js processes. How can I share the same session object across all the node.js processes? How can I share a global object, both within the same machine (the node.js cluster) and across the network? How should I design a distributed web app with node.js? Are there any modules that ease the synchronization tasks?
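
To make the problem concrete, here is a small single-process simulation, not real networking. Each `Worker` object stands in for one node.js process with its own in-memory session map, and `roundRobin` stands in for haproxy's scheduler. All names are made up for illustration.

```javascript
// Each Worker models one node.js process with a private session map.
class Worker {
  constructor(id) {
    this.id = id;
    this.sessions = new Map();
  }
  handle(sessionId) {
    const known = this.sessions.has(sessionId);
    if (!known) this.sessions.set(sessionId, { createdBy: this.id });
    return known; // true if this process already had the session
  }
}

// Returns a function that picks workers in strict rotation.
function roundRobin(workers) {
  let next = 0;
  return () => workers[next++ % workers.length];
}

const workers = [new Worker(0), new Worker(1), new Worker(2)];
const pick = roundRobin(workers);

// The same user sends four requests with the same session id.
const hits = [];
for (let i = 0; i < 4; i++) hits.push(pick().handle('user-42'));
console.log(hits); // [ false, false, false, true ]
```

The first three requests each land on a process that has never seen the session. Only the fourth, which wraps around to the first worker, finds it.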

Gabriel Llamas asked Nov 13 '12 02:11



2 Answers

You could use memcache or redis to store the session objects. This is also useful when node processes restart: session data kept in a process's memory would be lost, while data in an external store survives.
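
A minimal sketch of that idea, assuming a key-value client with promise-based `get`, `set`, and `del` methods. `FakeKvClient` below is an in-memory stand-in so the example runs without a server. It is not the API of any particular redis or memcache package, and `SessionStore` and its method names are likewise hypothetical.

```javascript
// In-memory stand-in for a remote key-value store (hypothetical API).
class FakeKvClient {
  constructor() { this.data = new Map(); }
  async get(key) { return this.data.has(key) ? this.data.get(key) : null; }
  async set(key, value) { this.data.set(key, value); }
  async del(key) { this.data.delete(key); }
}

// Sessions are serialized to JSON so any process on any machine can read them.
class SessionStore {
  constructor(client, prefix = 'sess:') {
    this.client = client;
    this.prefix = prefix;
  }
  async load(sessionId) {
    const raw = await this.client.get(this.prefix + sessionId);
    return raw === null ? {} : JSON.parse(raw);
  }
  async save(sessionId, session) {
    await this.client.set(this.prefix + sessionId, JSON.stringify(session));
  }
  async destroy(sessionId) {
    await this.client.del(this.prefix + sessionId);
  }
}

// Two store instances model two node.js processes sharing one backend.
async function demo() {
  const backend = new FakeKvClient();
  const processA = new SessionStore(backend);
  const processB = new SessionStore(backend);
  await processA.save('user-42', { cart: ['book'] });
  return processB.load('user-42'); // resolves to { cart: ['book'] }
}
```

Because the processes hold no session state of their own, it no longer matters which one the load balancer picks for a given request.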

You can also look through the pm2 feature list; some of its features may be useful to you.

Building a micro-services architecture can also help with scalability.

Ivan Pesochenko answered Oct 20 '22 16:10



As Ivan pointed out, you can store your session objects in memcache, redis, or even Couchbase (memcache buckets). I'd also add that if you want to build a scalable system, your goal should be a design that scales linearly, so that throughput grows with demand. By that I mean you should be able to add more hosts at any time (preferably during peak) to different tiers of your infrastructure to handle demand.

So you have to be very careful about the technology you pick and the design decisions you make during development.

"Suppose I need to implement a web application that will have a high volume of concurrent users."

Another thing I'd like to add: if you can't measure it, you can't manage it. A good start would be to define what "high volume of concurrent users" means to you. Is that Facebook or WhatsApp type of volume/concurrency? Define this first by working with your stakeholders (if any); then you can start making design decisions and picking technology.
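
One cheap way to start measuring is to count in-flight requests inside each process. The sketch below is only an assumption about how that could be done with the built-in `http` module; `ConcurrencyGauge` and `withGauge` are hypothetical names.

```javascript
const http = require('http');

// Tracks how many requests are in flight and the highest value seen.
class ConcurrencyGauge {
  constructor() { this.current = 0; this.peak = 0; }
  enter() {
    this.current += 1;
    if (this.current > this.peak) this.peak = this.current;
  }
  leave() { this.current -= 1; }
}

// Wraps a request handler so a request is counted until its response closes.
function withGauge(gauge, handler) {
  return (req, res) => {
    gauge.enter();
    res.once('close', () => gauge.leave());
    handler(req, res);
  };
}

const gauge = new ConcurrencyGauge();
const server = http.createServer(
  withGauge(gauge, (req, res) => res.end('ok\n'))
);
// server.listen(4001); // port taken from the question's setup
```

Logging `gauge.peak` per process over a realistic load test gives a first number to discuss with stakeholders, which is easier to design against than "high volume".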

A good litmus test while building a scalable system is to ask yourself, "Is there a single point of failure?" If yes, then your system won't scale.

na-98 answered Oct 20 '22 17:10
