I am testing pthread parallel code on Linux with gcc (GCC) 4.8.3 20140911, on a CentOS 7 Server.
The single-threaded version is simple; it initializes a 10000 * 10000 matrix:
#include <stdlib.h>

int main(void)
{
    int size = 10000;
    int *r = (int*)malloc(size * size * sizeof(int));
    /* Fill every cell with a pseudo-random value. */
    for (int i=0; i<size; i++) {
        for (int j=0; j<size; j++) {
            r[i * size + j] = rand();
        }
    }
    free(r);
    return 0;
}
Then I wanted to see if parallel code can improve the performance:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int size = 10000;

/* Fills rows 0, 2, 4, ... of the matrix. */
void *SetOdd(void *param)
{
    printf("Enter odd\n");
    int * r = (int*)param;
    for (int i=0; i<size; i+=2) {
        for (int j=0; j<size; j++) {
            r[i * size + j] = rand();
        }
    }
    printf("Exit Odd\n");
    pthread_exit(NULL);
    return 0;
}

/* Fills rows 1, 3, 5, ... of the matrix. */
void *SetEven(void *param)
{
    printf("Enter Even\n");
    int * r = (int*)param;
    for (int i=1; i<size; i+=2) {
        for (int j=0; j<size; j++) {
            r[i * size + j] = rand();
        }
    }
    printf("Exit Even\n");
    pthread_exit(NULL);
    return 0;
}

int main(void)
{
    printf("running in thread\n");
    pthread_t threads[2];
    int * r = (int*)malloc(size * size * sizeof(int));
    int rc0 = pthread_create(&threads[0], NULL, SetOdd, (void *)r);
    int rc1 = pthread_create(&threads[1], NULL, SetEven, (void *)r);
    if (rc0 || rc1) {
        printf("ERROR; return code from pthread_create() is %d\n", rc0 ? rc0 : rc1);
        exit(-1);
    }
    for(int t=0; t<2; t++) {
        void* status;
        int rc = pthread_join(threads[t], &status);
        if (rc) {
            printf("ERROR; return code from pthread_join() is %d\n", rc);
            exit(-1);
        }
        printf("Completed join with thread %d status= %ld\n",t, (long)status);
    }
    free(r);
    return 0;
}
The single-threaded code runs for about 0.8 seconds, while the multi-threaded version runs for about 10 seconds!
I am running on a 4-core server. Why is the multi-threaded version so slow?
rand() is not required to be thread-safe or re-entrant, so you shouldn't rely on it in multi-threaded applications.
Use rand_r() instead, which is also a pseudo-random generator and is thread-safe as long as each thread passes its own seed variable. Using rand_r() results in a shorter execution time for your code on my system with 2 cores (roughly half the time of the single-threaded version).
In both of your thread functions, do something like this:
void *SetOdd(void *param)
{
    printf("Enter odd\n");
    /* Per-thread generator state; time() needs #include <time.h>. */
    unsigned int s = (unsigned int)time(0);
    int * r = (int*)param;
    for (int i=0; i<size; i+=2) {
        for (int j=0; j<size; j++) {
            r[i * size + j] = rand_r(&s);
        }
    }
    printf("Exit Odd\n");
    pthread_exit(NULL);
    return 0;
}
Update:
While the C and POSIX standards do not mandate rand() to be a thread-safe function, the glibc implementation (used on Linux) actually does implement it in a thread-safe manner.
If we look at the glibc implementation of rand(), there's a lock:
__libc_lock_lock (lock);

(void) __random_r (&unsafe_state, &retval);

__libc_lock_unlock (lock);
Any synchronization construct (mutex, condition variable, etc.) costs performance, so the fewer such constructs the code uses, the better it performs (of course, we can't avoid them completely in multi-threaded applications).
So only one thread can access the random number generator at a time, and both threads are fighting for the lock all the time. This explains why rand() leads to poor performance in multi-threaded code.
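To put the pieces together, here is a minimal sketch of the same row-interleaved fill where each thread keeps its own rand_r() state, so no lock is shared between threads. The names fill_job, fill_rows and fill_matrix_parallel are made up for this illustration (they are not from the question), and deriving each seed as base_seed + t is just one simple assumption for giving every thread a different sequence.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes rand_r() under strict -std=c99 */
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical helper type: the work assigned to one thread. */
struct fill_job {
    int *matrix;        /* shared matrix of size * size ints */
    int size;           /* matrix dimension */
    int first_row;      /* first row this thread fills */
    int row_step;       /* stride between rows (the thread count) */
    unsigned int seed;  /* private rand_r() state, never shared */
};

/* Thread body: fills rows first_row, first_row + row_step, ... */
static void *fill_rows(void *arg)
{
    struct fill_job *job = arg;
    unsigned int state = job->seed;  /* local copy, so no contention */
    for (int i = job->first_row; i < job->size; i += job->row_step) {
        for (int j = 0; j < job->size; j++) {
            job->matrix[(size_t)i * job->size + j] = rand_r(&state);
        }
    }
    return NULL;
}

/* Fills the matrix with nthreads threads. Returns 0 on success,
 * -1 on bad arguments or allocation failure, or the pthread_create()
 * error code if a thread could not be started. */
int fill_matrix_parallel(int *matrix, int size, int nthreads,
                         unsigned int base_seed)
{
    if (matrix == NULL || size <= 0 || nthreads <= 0)
        return -1;

    pthread_t *threads = malloc((size_t)nthreads * sizeof *threads);
    struct fill_job *jobs = malloc((size_t)nthreads * sizeof *jobs);
    if (threads == NULL || jobs == NULL) {
        free(threads);
        free(jobs);
        return -1;
    }

    int started = 0;
    int rc = 0;
    for (int t = 0; t < nthreads; t++) {
        jobs[t].matrix = matrix;
        jobs[t].size = size;
        jobs[t].first_row = t;
        jobs[t].row_step = nthreads;
        jobs[t].seed = base_seed + (unsigned int)t;  /* assumed seeding scheme */
        rc = pthread_create(&threads[t], NULL, fill_rows, &jobs[t]);
        if (rc != 0)
            break;
        started++;
    }

    /* Join only the threads that were actually created. */
    for (int t = 0; t < started; t++)
        pthread_join(threads[t], NULL);

    free(threads);
    free(jobs);
    return rc;
}
```

From main() you would call something like fill_matrix_parallel(r, size, 2, (unsigned int)time(0)) and build with gcc -std=c99 -pthread. Note that if both threads were seeded with the same value they would produce identical sequences, which is why the sketch offsets the seed per thread.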