 

How does the operating system load balance between multiple processes accepting the same socket?

I'm reading the docs of the cluster module in Node.js:
http://nodejs.org/api/cluster.html

It claims the following:

When multiple processes are all accept()ing on the same underlying resource, the operating system load-balances across them very efficiently.

This sounds reasonable, but even after a couple of hours of googling I haven't found any article or anything at all that confirms it, or explains how this load-balancing logic works in the operating system.

Also, which operating systems do this kind of effective load balancing?
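
For reference, the pattern the docs describe (several worker processes all blocking in accept() on one listening socket inherited across fork()) looks roughly like the following C sketch. This is only a minimal illustration of the general idea, not the cluster module's actual implementation; the port number and worker count are arbitrary, and error handling is mostly omitted.

    /* Minimal sketch: N worker processes all blocking in accept()
     * on the same listening socket inherited across fork(). */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);          /* arbitrary example port */

        if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
            listen(lfd, 128) < 0) {
            perror("bind/listen");
            return 1;
        }

        for (int i = 0; i < 4; i++) {         /* 4 workers, chosen arbitrarily */
            if (fork() == 0) {                /* child: inherits lfd */
                for (;;) {
                    int cfd = accept(lfd, NULL, NULL);  /* all workers block here */
                    if (cfd < 0)
                        continue;
                    /* handle the connection, then close it */
                    close(cfd);
                }
            }
        }
        for (;;)
            pause();                          /* parent just waits */
    }

Each worker sits in accept(); whichever one the kernel wakes gets the next connection, which is the behaviour the quoted sentence calls load balancing.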

asked Sep 19 '12 by Venemo


1 Answer

"Load balancing" is perhaps a bit poor choice of words, essentially it's just a question of how does the OS choose which process to wake up and/or run next. Generally, the process scheduler tries to choose the process to run based on criteria like giving an equal share of cpu time to processes of equal priority, cpu/memory locality (don't bounce processes around the cpu's), etc. Anyway, by googling you'll find plenty of stuff to read about process scheduling algorithms and implementations.

Now, for the particular case of accept(), that also depends on how the OS implements waking up processes that are waiting on accept().

  • A simple implementation is to just wake up every process blocked on the accept() call, then let the scheduler choose the order in which they get to run.

  • The above is simple but leads to a "thundering herd" problem: only the first process succeeds in accepting the connection, and the others go back to blocking. A more sophisticated approach is for the OS to wake up only one process; here the choice of which process to wake can be made by asking the scheduler, or e.g. just by picking the first process in the blocked-on-accept()-for-this-socket list. The latter is what Linux has done for a decade or more, based on the link already posted by others.

  • Note that this only works for blocking accept(); for non-blocking accept() (which is surely what node.js is doing) the issue becomes which of the processes blocking in select()/poll()/whatever the event should be delivered to. The semantics of poll()/select() actually demand that all of them be woken up, so you have the thundering herd issue there again. For Linux, and probably in similar ways for other systems with system-specific high-performance polling interfaces, it's possible to avoid the thundering herd by using a single shared epoll fd and edge-triggered events. In that case the event is delivered to only one of the processes blocked on epoll_wait(). I think that, similarly to blocking accept(), the choice of which process to deliver the event to is just to pick the first one in the list of processes blocked on epoll_wait() for that particular epoll fd (see the sketch after this list).
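
To make that last point concrete, here is a rough sketch of the arrangement described above: a single edge-triggered epoll fd created before fork() and shared by all workers. This is my own illustration, not node.js's actual code; the port number and worker count are arbitrary, and error handling is mostly omitted.

    /* Sketch: one epoll fd, created before fork() and shared by all workers,
     * watching a non-blocking listening socket with edge-triggered events,
     * so an incoming connection wakes only one blocked worker. */
    #include <arpa/inet.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        fcntl(lfd, F_SETFL, O_NONBLOCK);          /* non-blocking accept() */

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);              /* arbitrary example port */
        bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
        listen(lfd, 128);

        int epfd = epoll_create1(0);              /* the shared epoll instance */
        struct epoll_event ev;
        memset(&ev, 0, sizeof(ev));
        ev.events = EPOLLIN | EPOLLET;            /* edge-triggered */
        ev.data.fd = lfd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, lfd, &ev);

        for (int i = 0; i < 4; i++) {             /* 4 workers, chosen arbitrarily */
            if (fork() == 0) {                    /* child inherits lfd and epfd */
                struct epoll_event events[16];
                for (;;) {
                    int n = epoll_wait(epfd, events, 16, -1);
                    for (int j = 0; j < n; j++) {
                        /* Edge-triggered: drain the whole backlog, since we
                         * won't be re-notified for connections already queued. */
                        for (;;) {
                            int cfd = accept(lfd, NULL, NULL);
                            if (cfd < 0)
                                break;            /* EAGAIN: nothing left to accept */
                            /* handle the connection, then close it */
                            close(cfd);
                        }
                    }
                }
            }
        }
        for (;;)
            pause();                              /* parent just waits */
    }

The inner accept() loop drains the whole backlog because with edge-triggered events you won't get another notification for connections that were already queued when the event fired.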

So at least for Linux, both for blocking accept() and for non-blocking accept() with edge-triggered epoll, there is no scheduling per se when choosing which process to wake. But OTOH, the workload will probably be quite evenly balanced between the processes anyway, as essentially the system will round-robin the processes in the order in which they finish their current work and go back to blocking in epoll_wait().

answered Oct 06 '22 by janneb