
How do I take ownership of an abandoned boost::interprocess::interprocess_mutex?

My scenario: one server and some clients (though not many). The server can only respond to one client at a time, so they must be queued up. I'm using a mutex (boost::interprocess::interprocess_mutex) to do this, wrapped in a boost::interprocess::scoped_lock.

The thing is, if one client dies unexpectedly (i.e. no destructor runs) while holding the mutex, the other clients are in trouble, because they are waiting on that mutex. I've considered using a timed wait, so if a client waits for, say, 20 seconds and doesn't get the mutex, it goes ahead and talks to the server anyway.

Problems with this approach: 1) It does this every time. If it's in a loop, talking constantly to the server, it needs to wait for the timeout every single time. 2) If there are three clients, and one of them dies while holding the mutex, the other two will just wait 20 seconds and talk to the server at the same time, which is exactly what I was trying to avoid.

So, how can I say to a client, "hey there, it seems this mutex has been abandoned, take ownership of it"?

Pedro d'Aquino asked Jul 24 '09 19:07



1 Answer

Unfortunately, this isn't supported by the boost::interprocess API as-is. There are a few ways you could implement it yourself, however:

If you are on a POSIX platform with support for pthread_mutexattr_setrobust_np, edit boost/interprocess/sync/posix/thread_helpers.hpp and boost/interprocess/sync/posix/interprocess_mutex.hpp to use robust mutexes, and handle the EOWNERDEAD return from pthread_mutex_lock in whatever way suits your protocol.
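To give an idea of what the robust-mutex route involves, here is a minimal sketch using raw pthreads rather than the Boost headers. It assumes the POSIX.1-2008 names pthread_mutexattr_setrobust and pthread_mutex_consistent (the standardized successors of the _np variants); init_robust_mutex and lock_robust are hypothetical helper names, and in real use the pthread_mutex_t would live inside your shared memory segment.

```cpp
#include <cerrno>
#include <pthread.h>

// Hypothetical helper: initialise a process-shared, robust mutex.
// `m` is assumed to point into memory that every client can see.
inline int init_robust_mutex(pthread_mutex_t* m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0) rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Hypothetical helper: lock the mutex, taking over from a dead owner.
// Returns 0 when the caller holds the lock; *recovered is set to true
// if the previous owner died while holding it.
inline int lock_robust(pthread_mutex_t* m, bool* recovered) {
    *recovered = false;
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        // We now own the mutex, but whatever it protects may be half-updated.
        // Repair the shared state here, then mark the mutex usable again.
        rc = pthread_mutex_consistent(m);
        if (rc == 0) {
            *recovered = true;
        } else {
            // Could not recover: release the lock and report the error.
            pthread_mutex_unlock(m);
        }
    }
    return rc;
}
```

A client would call lock_robust instead of pthread_mutex_lock; when recovered comes back true it knows the previous holder died mid-conversation and can reset any per-request state before talking to the server.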

If you are on some other platform, you could edit boost/interprocess/sync/emulation/interprocess_mutex.hpp to use a generation counter, with the locked flag in the lowest bit. Then you can create a reclaim protocol: set a flag in the lock word to indicate a pending reclaim, wait for a timeout, then do a compare-and-swap to check that the same generation is still in the lock word, and if so replace it with a locked next-generation value.
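A rough sketch of that generation-counter idea, written as a standalone type on top of std::atomic rather than as a patch to the Boost header. The bit layout, the GenerationLock name and the token-based interface are all assumptions made for illustration, and the sketch assumes std::atomic<std::uint32_t> is lock-free and safe to place in shared memory on your platform.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical lock-word layout (not Boost's actual layout):
//   bit 0      locked
//   bit 1      reclaim pending
//   bits 2..31 generation, bumped on every acquisition
constexpr std::uint32_t kLocked  = 1u;
constexpr std::uint32_t kPending = 2u;
constexpr std::uint32_t kGenStep = 4u;

inline std::uint32_t generation_of(std::uint32_t w) {
    return w & ~(kLocked | kPending);
}

struct GenerationLock {
    std::atomic<std::uint32_t> word{0};

    // Returns a nonzero token on success, 0 if the lock is currently held.
    std::uint32_t try_lock() {
        std::uint32_t cur = word.load(std::memory_order_relaxed);
        if (cur & kLocked) return 0;
        std::uint32_t next = (generation_of(cur) + kGenStep) | kLocked;
        return word.compare_exchange_strong(cur, next, std::memory_order_acquire) ? next : 0;
    }

    // Releases the lock only if `token` is still the current generation.
    // Returns false if somebody reclaimed the lock in the meantime.
    bool unlock(std::uint32_t token) {
        std::uint32_t cur = word.load(std::memory_order_relaxed);
        while ((cur & kLocked) && generation_of(cur) == generation_of(token)) {
            if (word.compare_exchange_weak(cur, generation_of(cur),
                                           std::memory_order_release,
                                           std::memory_order_relaxed)) {
                return true;
            }
        }
        return false;
    }

    // Marks a pending reclaim, waits `grace`, then takes the lock over if the
    // word is unchanged, i.e. the same generation is still holding it.
    template <class Rep, class Period>
    std::uint32_t try_reclaim(std::chrono::duration<Rep, Period> grace) {
        std::uint32_t seen = word.load(std::memory_order_relaxed);
        if (!(seen & kLocked)) return 0;  // not held: use try_lock instead
        std::uint32_t marked = seen | kPending;
        if (!(seen & kPending) &&
            !word.compare_exchange_strong(seen, marked, std::memory_order_relaxed)) {
            return 0;  // word changed under us, so the owner is alive
        }
        std::this_thread::sleep_for(grace);
        std::uint32_t next = (generation_of(marked) + kGenStep) | kLocked;
        return word.compare_exchange_strong(marked, next, std::memory_order_acquire) ? next : 0;
    }
};
```

Note the inherent weakness: an owner that is merely slow, not dead, will also be reclaimed once the grace period passes. It finds out only when unlock returns false, so the grace period has to be comfortably longer than any legitimate hold time.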

If you're on Windows, another good option would be to use native mutex objects; they'll likely be more efficient than busy-waiting anyway.

You may also want to reconsider the use of a shared-memory protocol - why not use a network protocol instead?

bdonlan answered Nov 06 '22 23:11
