While this question is tagged EventMachine, generic BSD-socket solutions in any language are much appreciated too.
I have an application listening on a TCP socket. It is started and shut down with a regular System V style init script.
My problem is that it needs some time to start up before it is ready to service the TCP socket. It's not too long, perhaps only 5 seconds, but that's 5 seconds too long when a restart needs to be performed during a workday. It's also crucial that existing connections remain open and are finished normally.
Reasons for a restart of the application are patches, upgrades, and the like. I unfortunately find myself in the position that, every once in a while, I need to do this kind of thing in production.
I'm looking for a way to do a neat hand-over of the TCP listening socket from one process to another, so that I get only a split second of downtime. I'd like existing connections / sockets to remain open and finish processing in the old process, while the new process starts servicing new connections.
Is there some proven method of doing this using BSD-sockets? (Bonus points for an EventMachine solution.)
Are there perhaps open-source libraries out there implementing this, that I can use as is, or use as a reference? (Again, non-Ruby and non-EventMachine solutions are appreciated too!)
If you want to pass a socket from one process to another on Windows, the only officially sanctioned way to do this is with Winsock 2's WSADuplicateSocket() function.
You can't listen and connect on the same socket. A listening socket can only be used to accept connections, and a connected socket can only be a single bi-directional pipe. Further, if several threads perform the same operation, such as send, on one socket, you risk interleaving data.
Two processes cannot bind to the same port at the same time - by default, anyway.
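To make the default behaviour concrete, here is a minimal sketch in Python (the function name second_bind_error is made up for this illustration). It uses two sockets in a single process for brevity, but the kernel applies the same check across processes: the second bind() on an address that already has a listener is refused.

```python
import errno
import socket

def second_bind_error():
    """Bind two plain TCP sockets to the same address; return the errno of the second bind."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            return exc.errno              # the address is already taken by `first`
        return None                       # would mean the second bind succeeded
    finally:
        first.close()
        second.close()

if __name__ == "__main__":
    print(second_bind_error() == errno.EADDRINUSE)
```

This is why the approaches below hand over the already-bound listening socket instead of binding a second one.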
There are a couple of ways to do this with no downtime, with appropriate modifications to the server program.
One is to implement a restart capability in the server itself, for example upon receipt of a certain signal or other message. The program would then exec its new version, passing it the file descriptor number of the listening socket e.g. as an argument. This socket would have the FD_CLOEXEC flag clear (the default) so that it would be inherited. Since the other sockets will continue to be serviced by the original process and should not be passed on to the new process, the flag should be set on those e.g. using fcntl(). After forking and execing the new process, the original process can go ahead and close the listening socket without any interruption to the service, since the new process is now listening on that socket.
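The fork-and-exec hand-over described above could look roughly like the following Python sketch. It is an illustration under assumptions, not a drop-in implementation: the names CHILD_CODE and hand_over_by_exec are invented here, and the "new version" of the server is just an inline script that answers one client. Note that Python (3.4 and later) marks descriptors close-on-exec by default, the opposite of the C-level default mentioned above, so the listening socket has to be passed explicitly via pass_fds while all other sockets stay private to the old process automatically.

```python
import socket
import subprocess
import sys

# Stand-in for the new server version: adopt the inherited descriptor, serve one client.
CHILD_CODE = """
import socket, sys
listener = socket.socket(fileno=int(sys.argv[1]))  # wrap the inherited fd number
conn, _ = listener.accept()
conn.sendall(b"hello from new process")
conn.close()
listener.close()
"""

def hand_over_by_exec():
    """Old server: start the new process with the listener inherited, then let go of it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))       # port 0: kernel picks a free port
    listener.listen(16)
    port = listener.getsockname()[1]

    fd = listener.fileno()
    # Popen forks and execs; pass_fds keeps `fd` open under the same number in the child.
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE, str(fd)], pass_fds=[fd]
    )
    # The old process stops accepting. The kernel socket stays open in the child,
    # so connections arriving in the meantime simply queue in the backlog.
    listener.close()

    # Act as a client to show that the new process now serves the same port.
    chunks = []
    with socket.create_connection(("127.0.0.1", port), timeout=10) as client:
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
    child.wait(timeout=10)
    return b"".join(chunks)

if __name__ == "__main__":
    print(hand_over_by_exec())
```

In a real server the old process would keep running after closing the listener until its existing connections have finished, and only then exit.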
An alternative method, if you do not want the old server to have to fork and exec the new server itself, would be to use a Unix-domain socket to communicate between the old and new server process. A new server process could check for such a socket in a well-known location in the file system when it is starting. If present, the new server would connect to this socket and request that the old server transfer its listening socket as ancillary data using SCM_RIGHTS. An example of this is given at the end of cmsg(3).
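A rough Python sketch of the SCM_RIGHTS variant follows. Several details are assumptions made for illustration only: the control socket lives in a temporary directory rather than a real well-known path, the one-byte request and acknowledgement messages are an invented mini-protocol, and os.fork() stands in for launching an independent new server so the example stays self-contained. The acknowledgement step is a design choice: the old server closes its listener only after the new one confirms receipt.

```python
import array
import os
import socket
import tempfile

def send_listener(control_conn, listener):
    """Old server: ship the listener's descriptor as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [listener.fileno()])
    control_conn.sendmsg([b"L"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])

def recv_listener(control_conn):
    """New server: receive one descriptor and wrap it in a socket object."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = control_conn.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    return socket.socket(fileno=fds[0])

def new_server(control_path):
    """New server: request the listener from the old server, then serve one client."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as control:
        control.connect(control_path)
        control.sendall(b"?")             # hypothetical one-byte hand-over request
        listener = recv_listener(control)
        control.sendall(b"K")             # acknowledge receipt of the descriptor
    conn, _ = listener.accept()
    conn.sendall(b"served by new process")
    conn.close()
    listener.close()

def demo_scm_rights_handover():
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    with tempfile.TemporaryDirectory() as tmp:
        control_path = os.path.join(tmp, "handover.sock")
        control = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        control.bind(control_path)
        control.listen(1)

        pid = os.fork()
        if pid == 0:
            # Child plays the new server. Drop the copies inherited through fork()
            # so that the listener really has to arrive via SCM_RIGHTS.
            control.close()
            listener.close()
            status = 1
            try:
                new_server(control_path)
                status = 0
            finally:
                os._exit(status)

        conn, _ = control.accept()
        conn.recv(1)                      # wait for the hand-over request
        send_listener(conn, listener)
        conn.recv(1)                      # wait for the acknowledgement
        conn.close()
        control.close()
        listener.close()                  # the old server no longer accepts

        chunks = []
        with socket.create_connection(("127.0.0.1", port), timeout=10) as client:
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                chunks.append(chunk)
        os.waitpid(pid, 0)
    return b"".join(chunks)

if __name__ == "__main__":
    print(demo_scm_rights_handover())
```

On Python 3.9 and later, socket.send_fds() and socket.recv_fds() wrap the same sendmsg/recvmsg calls and could replace the two helper functions.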