This is my follow-up to the previous post on memory management issues. The following are the multithreading issues I know of:
1) data races (atomicity violations and data corruption)
2) ordering problems
3) misuse of locks leading to deadlocks
4) heisenbugs
Are there any other issues with multithreading? How can they be solved?
Eric's list of four issues is pretty much spot on. But debugging these issues is tough.
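For deadlock, I've always favored "leveled locks". Essentially, you give each type of lock a level number and then require that a thread acquire locks only in strictly increasing level order. Since every thread takes its locks in the same global order, no cycle of threads waiting on each other can form.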
To do leveled locks, you can declare a structure like this:
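#include <assert.h>
typedef struct my_lock_struct {
    os_mutex actual_lock;   /* os_mutex: whatever mutex type your OS provides */
    int level;
    struct my_lock_struct *prev_lock_in_thread;
} my_lock_struct;
/* Most recent lock taken by this thread; __tls stands for your compiler's thread-local qualifier. */
static __tls my_lock_struct *last_lock_in_thread;
void my_lock_acquire(int level, my_lock_struct *lock) {
    /* Locks must be taken in strictly increasing level order. */
    if (last_lock_in_thread != NULL) assert(last_lock_in_thread->level < level);
    os_lock_acquire(lock->actual_lock);
    lock->level = level;
    lock->prev_lock_in_thread = last_lock_in_thread;
    last_lock_in_thread = lock;
}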
What's cool about leveled locks is that the mere possibility of deadlock triggers an assertion, even on a run where the threads happen not to deadlock. And with some extra magic with __func__ and __LINE__, you know exactly what badness your thread did.
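For anyone who wants something that compiles, here is a minimal sketch of the same idea on top of POSIX threads. Everything in it is my assumption rather than the code above: pthread_mutex_t stands in for os_mutex, C11's _Thread_local stands in for __tls, the level is fixed when the lock is initialized instead of being passed to acquire, and all the leveled_lock_* names are made up for illustration. It also adds a release function, which the snippet above leaves out, and a macro that records the call site.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* All names here are hypothetical; pthread_mutex_t stands in for os_mutex. */
typedef struct leveled_lock {
    pthread_mutex_t mutex;
    int level;                  /* fixed at init time */
    struct leveled_lock *prev;  /* lock this thread held before taking this one */
} leveled_lock;

/* Innermost lock currently held by the calling thread (C11 thread-local). */
static _Thread_local leveled_lock *innermost_lock = NULL;

void leveled_lock_init(leveled_lock *lock, int level) {
    pthread_mutex_init(&lock->mutex, NULL);
    lock->level = level;
    lock->prev = NULL;
}

/* Returns 1 if taking `lock` now keeps this thread's levels strictly increasing. */
int leveled_lock_order_ok(const leveled_lock *lock) {
    return innermost_lock == NULL || innermost_lock->level < lock->level;
}

/* Checks the ordering rule first, so a violation aborts instead of deadlocking. */
void leveled_lock_acquire_at(leveled_lock *lock, const char *func, int line) {
    if (!leveled_lock_order_ok(lock)) {
        fprintf(stderr, "%s:%d: wants level %d while holding level %d\n",
                func, line, lock->level, innermost_lock->level);
        abort();
    }
    pthread_mutex_lock(&lock->mutex);
    lock->prev = innermost_lock;
    innermost_lock = lock;
}

/* This sketch requires releases in reverse order of acquisition. */
void leveled_lock_release(leveled_lock *lock) {
    assert(innermost_lock == lock);
    innermost_lock = lock->prev;
    lock->prev = NULL;
    pthread_mutex_unlock(&lock->mutex);
}

/* Records the caller's function name and line number automatically. */
#define leveled_lock_acquire(lock) \
    leveled_lock_acquire_at((lock), __func__, __LINE__)
```

A thread that holds a level 1 lock can take a level 2 lock with leveled_lock_acquire(&b), but asking for level 1 while holding level 2 aborts and prints the offending function and line. Requiring reverse-order release keeps the per-thread bookkeeping down to a single pointer; a real implementation might want to relax that.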
For data races and missing synchronization, the current situation is pretty poor. There are static tools that try to identify such issues, but their false-positive rates are high.
The company I work for ( http://www.corensic.com ) has a new product called Jinx that actively looks for cases where race conditions can be exposed. This is done by using virtualization technology to control the interleaving of threads on the various CPUs and zooming in on communication between CPUs.
Check it out. You probably have a few more days to download the Beta for free.
Jinx is particularly good at finding bugs in lock-free data structures. It also does very well at finding other race conditions. What's cool is that there are no false positives. If your testing gets close to a race condition, Jinx helps the code go down the bad path. But if the bad path doesn't exist, you won't be given false warnings.
Unfortunately there's no good pill that helps automatically solve most/all threading issues. Even unit tests that work so well on single-threaded pieces of code may never detect an extremely subtle race condition.
One thing that will help is keeping the thread-interaction data encapsulated in objects. The smaller the interface/scope of the object, the easier it will be to detect errors in review (and possibly testing, but race conditions can be a pain to detect in test cases). By keeping a simple interface that can be used, clients that use the interface will also be correct just by default. By building up a bigger system from lots of smaller pieces (only a handful of which actually do thread-interaction), you can go a long way towards averting threading errors in the first place.
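As a rough illustration of that kind of encapsulation, here is a small sketch in C with POSIX threads. The shared_counter type and its functions are hypothetical names I made up for this example. The point is that the mutex and the data it protects live in one struct, and callers only ever go through three functions, so there is very little code to review for locking mistakes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical example type: the mutex and the data it guards live together. */
typedef struct {
    pthread_mutex_t mutex;
    long value;
} shared_counter;

void shared_counter_init(shared_counter *c) {
    assert(c != NULL);
    pthread_mutex_init(&c->mutex, NULL);
    c->value = 0;
}

/* Adds delta and returns the new value. This is the only way to modify it. */
long shared_counter_add(shared_counter *c, long delta) {
    long result;
    assert(c != NULL);
    pthread_mutex_lock(&c->mutex);
    c->value += delta;
    result = c->value;
    pthread_mutex_unlock(&c->mutex);
    return result;
}

/* Reads the value under the same lock, so readers never see a torn update. */
long shared_counter_get(shared_counter *c) {
    long result;
    assert(c != NULL);
    pthread_mutex_lock(&c->mutex);
    result = c->value;
    pthread_mutex_unlock(&c->mutex);
    return result;
}
```

Client code never touches the mutex directly, so it cannot forget to lock or unlock. In a real project you would probably go one step further and hide the struct definition in the .c file, exposing only an opaque pointer in the header.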
The four most common problems with threading are
1-Deadlock
2-Livelock
3-Race Conditions
4-Starvation
How to solve [issues with multi threading]?
A good way to "debug" MT applications is through logging. A good logging library with extensive filtering options makes it easier. Of course, logging itself influences the timing, so you can still have "heisenbugs", but that's much less likely than when you're actually breaking into the debugger.
Prepare and plan for that. Include a good logging facility into your application from the start.
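To make that concrete, here is a minimal sketch of what such a facility could look like in C with POSIX threads. It is not any particular logging library; the mtlog_* names, the three levels, and the line format are all assumptions of mine. Each line gets a global sequence number and a per-thread label, so you can reconstruct the interleaving afterwards and filter by thread or by level.

```c
#include <assert.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

/* All names are hypothetical; this is not any particular logging library. */
typedef enum { MTLOG_DEBUG, MTLOG_INFO, MTLOG_ERROR } mtlog_level;

static pthread_mutex_t mtlog_mutex = PTHREAD_MUTEX_INITIALIZER;
static mtlog_level mtlog_threshold = MTLOG_INFO;
static FILE *mtlog_sink = NULL;           /* NULL means stderr */
static unsigned long mtlog_sequence = 0;  /* global order of emitted lines */

/* Each thread can label itself so its lines are easy to filter later. */
static _Thread_local const char *mtlog_thread_name = "main";

void mtlog_set_thread_name(const char *name) {
    assert(name != NULL);
    mtlog_thread_name = name;
}

void mtlog_configure(FILE *sink, mtlog_level threshold) {
    pthread_mutex_lock(&mtlog_mutex);
    mtlog_sink = sink;
    mtlog_threshold = threshold;
    pthread_mutex_unlock(&mtlog_mutex);
}

/* Returns 1 if the line was written, 0 if the level filter dropped it. */
int mtlog_write(mtlog_level level, const char *func, int line,
                const char *fmt, ...) {
    int written = 0;
    pthread_mutex_lock(&mtlog_mutex);
    if (level >= mtlog_threshold) {
        FILE *out = mtlog_sink != NULL ? mtlog_sink : stderr;
        va_list args;
        fprintf(out, "%lu [%s] %s:%d: ", ++mtlog_sequence,
                mtlog_thread_name, func, line);
        va_start(args, fmt);
        vfprintf(out, fmt, args);
        va_end(args);
        fputc('\n', out);
        fflush(out);
        written = 1;
    }
    pthread_mutex_unlock(&mtlog_mutex);
    return written;
}

/* Captures the calling function and line number at the call site. */
#define MTLOG(level, ...) mtlog_write((level), __func__, __LINE__, __VA_ARGS__)
```

A worker thread would call mtlog_set_thread_name("worker-1") once at startup and then MTLOG(MTLOG_INFO, "queue depth %d", depth) wherever it matters. Keep in mind that the single mutex serializes all logging threads, which is exactly the kind of timing influence mentioned above; a per-thread buffer would disturb the timing less, at the cost of more code.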