Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to know whether any process is bound to a Unix domain socket?

I'm writing a Unix domain socket server for Linux.

A peculiarity of Unix domain sockets I quickly found out is that, while creating a listening Unix socket creates the matching filesystem entry, closing the socket doesn't remove it. Moreover, until the filesystem entry is removed manually, it's not possible to bind() a socket to the same path again : bind() fails with EADDRINUSE if the path it is given already exists in the filesystem.

As a consequence, the socket's filesystem entry needs to be unlink()'ed on server shutdown to avoid getting EADDRINUSE on server restart. However, this cannot always be done (i.e.: server crash). Most FAQs, forum posts, Q&A websites I found only advise, as a workaround, to unlink() the socket prior to calling bind(). In this case however, it becomes desirable to know whether a process is bound to this socket before unlink()'ing it.

Indeed, unlink()'ing a Unix socket while a process is still bound to it and then re-creating the listening socket doesn't raise any error. As a result, however, the old server process is still running but unreachable : the old listening socket is "masked" by the new one. This behavior has to be avoided.

Ideally, using Unix domain sockets, the socket API should have exposed the same "mutual exclusion" behavior that is exposed when binding TCP or UDP sockets : "I want to bind socket S to address A; if a process is already bound to this address, just complain !" Unfortunately this is not the case...

Is there a way to enforce this "mutual exclusion" behavior ? Or, given a filesystem path, is there a way to know, via the socket API, whether any process on the system has a Unix domain socket bound to this path ? Should I use a synchronization primitive external to the socket API (flock(), ...) ? Or am I missing something ?

Thanks for your suggestions.

Note : Linux's abstract namespace Unix sockets seem to solve this issue, as there is no filesystem entry to unlink(). However, the server I'm writing aims to be generic : it must be robust against both types of Unix domain sockets, as I am not responsible for choosing listening addresses.

like image 547
Simon Malinge Avatar asked Sep 13 '11 17:09

Simon Malinge


People also ask

How does a process communicate with a domain socket?

IPC sockets (aka Unix domain sockets) enable channel-based communication for processes on the same physical device (host), whereas network sockets enable this kind of IPC for processes that can run on different hosts, thereby bringing networking into play.

How does Unix domain socket work?

Unix sockets are bidirectional. This means that every side can perform both read and write operations. While, FIFOs are unidirectional: it has a writer peer and a reader peer. Unix sockets create less overhead and communication is faster, than by localhost IP sockets.

Do UNIX domain sockets use ports?

Unix Domain Socket uses a local file on the device. It does not require network ports to be open, instead the Linux system controls who can have access to the file for communication.

How do you bind a socket in UNIX?

When a socket is created with socket(2), it exists in a name space (address family) but has no address assigned to it. bind() assigns the address specified by addr to the socket referred to by the file descriptor sockfd. addrlen specifies the size, in bytes, of the address structure pointed to by addr.


1 Answers

I know I am very late to the party and that this was answered a long time ago but I just encountered this searching for something else and I have an alternate proposal.

When you encounter the EADDRINUSE return from bind() you can enter an error checking routine that connects to the socket. If the connection succeeds, there is a running process that is at least alive enough to have done the accept(). This strikes me as being the simplest and most portable way of achieving what you want to achieve. It has drawbacks in that the server that created the UDS in the first place may actually still be running but "stuck" somehow and unable to do an accept(), so this solution certainly isn't fool-proof, but it is a step in the right direction I think.

If the connect() fails then go ahead and unlink() the endpoint and try the bind() again.

like image 94
Kean Avatar answered Oct 12 '22 05:10

Kean