 

How does the operating system load balance between multiple processes accepting the same socket?

I'm reading the docs of the cluster module in Node.js:
http://nodejs.org/api/cluster.html

It claims the following:

When multiple processes are all accept()ing on the same underlying resource, the operating system load-balances across them very efficiently.

This sounds reasonable, but even after a couple of hours of googling I haven't found any article or anything at all that confirms it, or explains how this load-balancing logic works in the operating system.

Also, which operating systems do this kind of effective load balancing?
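
For reference, the pattern the docs describe (several worker processes all blocking in accept() on one listening socket inherited across fork()) looks roughly like the following C sketch. This is only a minimal illustration of the general idea, not the cluster module's actual implementation; the port number and worker count are arbitrary, and error handling is mostly omitted.

    /* Minimal sketch: N worker processes all blocking in accept()
     * on the same listening socket inherited across fork(). */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);          /* arbitrary example port */

        if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
            listen(lfd, 128) < 0) {
            perror("bind/listen");
            return 1;
        }

        for (int i = 0; i < 4; i++) {         /* 4 workers, chosen arbitrarily */
            if (fork() == 0) {                /* child: inherits lfd */
                for (;;) {
                    int cfd = accept(lfd, NULL, NULL);  /* all workers block here */
                    if (cfd < 0)
                        continue;
                    /* handle the connection, then close it */
                    close(cfd);
                }
            }
        }
        for (;;)
            pause();                          /* parent just waits */
    }

Each worker sits in accept(); whichever one the kernel wakes gets the next connection, which is the behaviour the quoted sentence calls load balancing.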

asked Sep 19 '12 by Venemo


1 Answer

"Load balancing" is perhaps a bit poor choice of words, essentially it's just a question of how does the OS choose which process to wake up and/or run next. Generally, the process scheduler tries to choose the process to run based on criteria like giving an equal share of cpu time to processes of equal priority, cpu/memory locality (don't bounce processes around the cpu's), etc. Anyway, by googling you'll find plenty of stuff to read about process scheduling algorithms and implementations.

Now, for the particular case of accept(), that also depends on how the OS implements waking up processes that are waiting on accept().

  • A simple implementation is to just wake up every process blocked on the accept() call, then let the scheduler choose the order in which they get to run.

  • The above is simple but leads to a "thundering herd" problem: only the first process succeeds in accepting the connection, and the others go back to blocking. A more sophisticated approach is for the OS to wake up only one process; here the choice of which process to wake can be made by asking the scheduler, or e.g. just by picking the first process in the blocked-on-accept()-for-this-socket list. The latter is what Linux has done for a decade or more, based on the link already posted by others.

  • Note that this only works for blocking accept(); for non-blocking accept() (which is surely what node.js is doing) the issue becomes which of the processes blocking in select()/poll()/whatever the event should be delivered to. The semantics of poll()/select() actually demand that all of them be woken up, so you have the thundering herd issue there again. For Linux, and probably in similar ways for other systems with system-specific high-performance polling interfaces, it's possible to avoid the thundering herd by using a single shared epoll fd and edge-triggered events. In that case the event is delivered to only one of the processes blocked on epoll_wait(). I think that, similarly to blocking accept(), the choice of which process to deliver the event to is just to pick the first one in the list of processes blocked on epoll_wait() for that particular epoll fd (see the sketch after this list).
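
To make that last point concrete, here is a rough sketch of the arrangement described above: a single edge-triggered epoll fd created before fork() and shared by all workers. This is my own illustration, not node.js's actual code; the port number and worker count are arbitrary, and error handling is mostly omitted.

    /* Sketch: one epoll fd, created before fork() and shared by all workers,
     * watching a non-blocking listening socket with edge-triggered events,
     * so an incoming connection wakes only one blocked worker. */
    #include <arpa/inet.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        fcntl(lfd, F_SETFL, O_NONBLOCK);          /* non-blocking accept() */

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);              /* arbitrary example port */
        bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
        listen(lfd, 128);

        int epfd = epoll_create1(0);              /* the shared epoll instance */
        struct epoll_event ev;
        memset(&ev, 0, sizeof(ev));
        ev.events = EPOLLIN | EPOLLET;            /* edge-triggered */
        ev.data.fd = lfd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, lfd, &ev);

        for (int i = 0; i < 4; i++) {             /* 4 workers, chosen arbitrarily */
            if (fork() == 0) {                    /* child inherits lfd and epfd */
                struct epoll_event events[16];
                for (;;) {
                    int n = epoll_wait(epfd, events, 16, -1);
                    for (int j = 0; j < n; j++) {
                        /* Edge-triggered: drain the whole backlog, since we
                         * won't be re-notified for connections already queued. */
                        for (;;) {
                            int cfd = accept(lfd, NULL, NULL);
                            if (cfd < 0)
                                break;            /* EAGAIN: nothing left to accept */
                            /* handle the connection, then close it */
                            close(cfd);
                        }
                    }
                }
            }
        }
        for (;;)
            pause();                              /* parent just waits */
    }

The inner accept() loop drains the whole backlog because with edge-triggered events you won't get another notification for connections that were already queued when the event fired.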

So at least for Linux, both for blocking accept() and for non-blocking accept() with edge-triggered epoll, there is no scheduling per se when choosing which process to wake. But OTOH, the workload will probably be quite evenly balanced between the processes anyway, as essentially the system will round-robin the processes in the order in which they finish their current work and go back to blocking in epoll_wait().

answered Oct 06 '22 by janneb