
Multi-threaded C program much slower in OS X than Linux

I wrote this for an OS class assignment that I've already completed and handed in. I posted this question yesterday, but due to "Academic Honesty" regulations I took it off until after the submission deadline.

The objective was to learn how to use critical sections. There is a data array of 100 monotonically increasing numbers, 0...99, and 40 threads that each randomly swap two elements 2,000,000 times. Once a second, a Checker goes through and makes sure that there is exactly one of each number (which means that no unsynchronized parallel access happened).

Here were the Linux times:

real    0m5.102s
user    0m5.087s
sys     0m0.000s

And here were the OS X times:

real    6m54.139s
user    0m41.873s
sys     6m43.792s

I run a Vagrant box with ubuntu/trusty64 on the same machine that is running OS X. It is a 2012 rMBP with a quad-core i7 at 2.3 GHz (up to 3.2 GHz).

If I understand correctly, sys is system overhead, which I have no control over, and even then, 41s of user time suggests that perhaps the threads are running serially.

I can post all the code if needed, but I will post the bits I think are relevant. I am using pthreads since that's what Linux provides, but I assumed they work on OS X.

Creating swapper threads to run swapManyTimes routine:

for (int i = 0; i < NUM_THREADS; i++) {
    int err = pthread_create(&(threads[i]), NULL, swapManyTimes, NULL);
}

Swapper thread critical section, run in a for loop 2 million times:

pthread_mutex_lock(&mutex);    // begin critical section
int tmpFirst = data[first];
data[first] = data[second];
data[second] = tmpFirst;
pthread_mutex_unlock(&mutex);  // end critical section
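For context, here is a minimal sketch of how that critical section might sit inside the thread routine, together with creation and joining of the swapper threads. The question does not show how `first` and `second` are chosen, so the per-thread xorshift generator `nextRandom`, the variable `swapsPerThread`, and the wrapper `runSwappers` are assumptions made for illustration, not the original code.

```c
#include <pthread.h>
#include <stddef.h>

#define DATA_SIZE   100   /* array size from the question */
#define NUM_THREADS 40    /* swapper count from the question */

static int data[DATA_SIZE];
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static long swapsPerThread = 2000000;  /* hypothetical name; value from the question */

/* Hypothetical helper: small xorshift generator with per-thread state,
   so picking indices needs no shared data and no locking. */
unsigned int nextRandom(unsigned int *state)
{
    unsigned int x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Thread routine: swap two randomly chosen elements under the mutex. */
void *swapManyTimes(void *arg)
{
    unsigned int seed = (unsigned int)(size_t)arg + 1u;  /* must be nonzero */

    for (long n = 0; n < swapsPerThread; n++) {
        int first  = (int)(nextRandom(&seed) % DATA_SIZE);
        int second = (int)(nextRandom(&seed) % DATA_SIZE);

        pthread_mutex_lock(&mutex);    /* begin critical section */
        int tmpFirst = data[first];
        data[first]  = data[second];
        data[second] = tmpFirst;
        pthread_mutex_unlock(&mutex);  /* end critical section */
    }
    return NULL;
}

/* Fill data with 0..DATA_SIZE-1, start the swappers, and join them.
   Returns 0 on success or the first pthread_create error number. */
int runSwappers(void)
{
    pthread_t threads[NUM_THREADS];
    int created = 0, err = 0;

    for (int i = 0; i < DATA_SIZE; i++)
        data[i] = i;

    for (int i = 0; i < NUM_THREADS; i++) {
        err = pthread_create(&threads[i], NULL, swapManyTimes,
                             (void *)(size_t)i);
        if (err != 0)
            break;                     /* stop creating, join what exists */
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    return err;
}
```

Only the three-line swap is inside the lock; the index selection stays outside so the critical section is as short as possible. Calling `runSwappers()` from `main` reproduces the workload, and with 40 threads it performs 80,000,000 lock/unlock pairs on a single mutex.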

Only one Checker thread is created, the same way as the Swappers. It goes over the data array and marks the index corresponding to each value as true. Afterwards, it checks how many indices are empty, like so:

pthread_mutex_lock(&mutex);
for (int i = 0; i < DATA_SIZE; i++) {
    int value = data[i];
    consistency[value] = 1;
}
pthread_mutex_unlock(&mutex); 

It runs once a second by calling sleep(1) after it runs through its while(1) loop. After all swapper threads are joined this thread is cancelled and joined as well.
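The Checker described above could look roughly like the sketch below. The names `countMissing` and `checker` are hypothetical, and the counting step is a guess at what "checks how many indices are empty" means; only the marking loop is taken from the question.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define DATA_SIZE 100   /* array size from the question */

static int data[DATA_SIZE];
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

/* Returns how many values in 0..DATA_SIZE-1 do not appear in data.
   A result of 0 means every number is present exactly once. */
int countMissing(void)
{
    int consistency[DATA_SIZE];
    memset(consistency, 0, sizeof consistency);

    pthread_mutex_lock(&mutex);
    for (int i = 0; i < DATA_SIZE; i++) {
        int value = data[i];
        if (value >= 0 && value < DATA_SIZE)  /* guard against corrupt values */
            consistency[value] = 1;
    }
    pthread_mutex_unlock(&mutex);

    /* Counting happens on the private copy, so no lock is needed here. */
    int missing = 0;
    for (int i = 0; i < DATA_SIZE; i++)
        if (!consistency[i])
            missing++;
    return missing;
}

/* Checker thread: verify once a second until cancelled. */
void *checker(void *arg)
{
    (void)arg;
    while (1) {
        int missing = countMissing();
        if (missing != 0)
            fprintf(stderr, "inconsistent: %d values missing\n", missing);
        sleep(1);   /* sleep() is a cancellation point */
    }
    return NULL;
}
```

With the default deferred cancellation, `pthread_cancel` takes effect only at a cancellation point such as `sleep`. `pthread_mutex_lock` is not a cancellation point, so in this sketch the Checker is not cancelled while it holds the mutex.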

I would be happy to provide any more information that can help figure out why this sucks so much on Mac. I'm not really looking for help with code optimization, unless that's what's tripping up OS X. I've tried building it using both clang and gcc-4.9 on OS X.

asked Mar 05 '15 22:03 by Alex Popov



1 Answer

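Mac OS X and Linux implement pthreads differently, which causes this slow behavior. Specifically, Mac OS X does not use spinlocks (they are optional; note that pthreads is specified by POSIX, not by the ISO C standard). A thread that finds the mutex held therefore has to go through the kernel rather than retrying briefly in user space, which is consistent with the very large sys time in the question. With 40 threads taking one mutex 80,000,000 times in total, this can lead to very, very slow code.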
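One portable way to approximate spinning is to retry `pthread_mutex_trylock` a bounded number of times before falling back to a blocking `pthread_mutex_lock`. This is only a sketch: the helper `lockWithSpin` and the value of `SPIN_TRIES` are assumptions, and whether it actually narrows the gap on OS X would have to be measured.

```c
#include <errno.h>
#include <pthread.h>

#define SPIN_TRIES 100   /* assumed tuning value, not measured */

/* Try to take the mutex without blocking up to SPIN_TRIES times,
   then fall back to a normal blocking lock.
   Returns 0 on success or a pthread error number. */
int lockWithSpin(pthread_mutex_t *m)
{
    for (int i = 0; i < SPIN_TRIES; i++) {
        int rc = pthread_mutex_trylock(m);
        if (rc == 0)
            return 0;        /* acquired while spinning */
        if (rc != EBUSY)
            return rc;       /* a real error, not just contention */
    }
    return pthread_mutex_lock(m);   /* give up spinning and block */
}
```

In the swapper loop, `lockWithSpin(&mutex)` would replace `pthread_mutex_lock(&mutex)`, and the unlock call stays the same. Because the critical section is only three assignments, the holder usually releases the mutex quickly, which is the situation where a short spin tends to be cheaper than sleeping.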

answered Oct 05 '22 07:10 by Log_n