
How many simultaneous readers can a pthread_rwlock have?

I have a multi-threaded application that creates 48 threads that all need to access a common data member (a std::map). The map will only be written to when the threads start; the rest of the time it will only be read from. This seems like the perfect use case for a pthread_rwlock, and all appears to be working well.

I ran across a completely unrelated segfault and started analyzing the core. Using gdb, I executed the command info threads and was quite surprised at the results: several threads were actually reading from the map as expected, but the strange part is that several others were blocked in pthread_rwlock_rdlock(), waiting on the rwlock.

Here is the stack trace for a thread that is waiting on the lock:

#0  0xffffe430 in __kernel_vsyscall ()
#1  0xf76fe159 in __lll_lock_wait () from /lib/libpthread.so.0
#2  0xf76fab5d in pthread_rwlock_rdlock () from /lib/libpthread.so.0
#3  0x0804a81a in DiameterServiceSingleton::getDiameterService(void*) ()

With so many threads it's difficult to say how many were reading and how many were blocked, but I don't understand why any threads would be blocked waiting to read when other threads are already reading.

So here is my question: why are some threads blocked waiting to read an rwlock while other threads are already reading from it? It appears as though there is a limit on the number of threads that can simultaneously read.

I've looked at the pthread_rwlockattr_t functions and didn't see anything related.

The OS is Linux, SUSE 11.

Here is the related code:

// Initialization (presumably in the singleton's constructor)
{
  pthread_rwlock_init(&serviceMapRwLock_, NULL);
}

// This method is called for each request processed by the threads
Service *ServiceSingleton::getService(void *serviceId)
{
  pthread_rwlock_rdlock(&serviceMapRwLock_);
    ServiceMapType::const_iterator iter = serviceMap_.find(serviceId);
    bool notFound(iter == serviceMap_.end());
  pthread_rwlock_unlock(&serviceMapRwLock_);

  if(notFound)
  {
    return NULL;
  }

  return iter->second;
}

// This method is only called when the app is starting
void ServiceSingleton::addService(void *serviceId, Service *service)
{
  pthread_rwlock_wrlock(&serviceMapRwLock_);
    serviceMap_[serviceId] = service;
  pthread_rwlock_unlock(&serviceMapRwLock_);
}

Update:

As mentioned in the comments by MarkB, if I had used pthread_rwlockattr_setkind_np() to give priority to writers, and a writer were blocked waiting, then the observed behavior would make sense. But I'm using the default value, which I believe gives priority to readers. I just verified that there are no threads blocked waiting to write. I also updated the code as suggested by @Shahbaz in the comments and got the same results.
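
For reference, here is a minimal sketch (not part of the original code) of how the lock kind would be set explicitly. These attributes are glibc-specific, and PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP is the variant that actually blocks new readers behind a waiting writer:

pthread_rwlockattr_t attr;
pthread_rwlockattr_init(&attr);
// The glibc default is PTHREAD_RWLOCK_PREFER_READER_NP. The plain
// PREFER_WRITER_NP kind still behaves like reader preference in glibc;
// only the NONRECURSIVE variant really favors waiting writers.
pthread_rwlockattr_setkind_np(&attr, PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
pthread_rwlock_init(&serviceMapRwLock_, &attr);
pthread_rwlockattr_destroy(&attr);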

asked Aug 08 '12 by Brady


2 Answers

You merely observed the inherent cost of acquiring the lock itself. It takes some time, and you happened to catch those threads in the middle of it. This is particularly noticeable when the operation protected by the lock has a very short duration.

Edit: Reading the source, glibc uses lll_lock to protect critical sections within its own pthread library data structures. pthread_rwlock_rdlock checks several flags and increments counters, and it does those things while holding that internal lock. Once they are done, the internal lock is released with lll_unlock.

To demonstrate, I implemented a short routine that sleeps after acquiring the rwlock. The main thread waits until all of the threads hold the read lock at once, prints the concurrency they achieved, and then waits for them to finish.

#include <pthread.h>
#include <unistd.h>
#include <iostream>

enum { CONC = 50 };

pthread_rwlock_t rwlock;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
unsigned count;

void *routine(void *arg)
{
    int *fds = static_cast<int *>(arg);
    pthread_rwlock_rdlock(&rwlock);
    pthread_mutex_lock(&mutex);
    ++count;
    if (count == CONC) pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);
    sleep(5);
    pthread_rwlock_unlock(&rwlock);
    pthread_t self = pthread_self();
    write(fds[1], &self, sizeof(self));
    return 0;
}

And the main thread waits for the counter to reach 50:

int main()
{
    int fds[2];
    pipe(fds);
    pthread_rwlock_init(&rwlock, 0);
    pthread_mutex_lock(&mutex);
    for (int i = 0; i < CONC; i++) {
        pthread_t tid;
        pthread_create(&tid, NULL, routine, fds);
    }
    while (count < CONC) pthread_cond_wait(&cond, &mutex);
    pthread_mutex_unlock(&mutex);
    std::cout << "count: " << count << std::endl;
    for (int i = 0; i < CONC; i++) {
        pthread_t tid;
        read(fds[0], &tid, sizeof(tid));
        pthread_join(tid, 0);
    }
    pthread_rwlock_destroy(&rwlock);
    pthread_exit(0);
}
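
If the machine manages to start all 50 threads within the 5-second sleep window, every thread increments the counter while still holding the read lock, and the program prints count: 50: all 50 readers held the rwlock simultaneously, so the lock itself imposes no limit on concurrent readers.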

Edit: Simplified the example using C++11 thread support:

#include <pthread.h>
#include <unistd.h>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

enum { CONC = 1000 };
std::vector<std::thread> threads;

pthread_rwlock_t rwlock;
std::mutex mutex;
std::condition_variable cond;
unsigned count;

void *routine(int self)
{
    pthread_rwlock_rdlock(&rwlock);
    { std::unique_lock<std::mutex> lk(mutex);
      if (++count == CONC) cond.notify_one(); }
    sleep(5);
    pthread_rwlock_unlock(&rwlock);
    return 0;
}

int main()
{
    pthread_rwlock_init(&rwlock, 0);
    { std::unique_lock<std::mutex> lk(mutex);
      for (int i = 0; i < CONC; i++) {
          threads.push_back(std::thread(routine, i));
      }
      cond.wait(lk, [](){return count == CONC;}); }
    std::cout << "count: " << count << std::endl;
    for (int i = 0; i < CONC; i++) {
        threads[i].join();
    }
    pthread_rwlock_destroy(&rwlock);
    pthread_exit(0);
}
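
Compiled with thread support enabled (e.g. g++ -std=c++11 -pthread), this version should likewise report count: 1000. It deliberately keeps the pthread rwlock rather than switching to a C++11 primitive, since the point is to exercise pthread_rwlock_rdlock itself.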
answered Oct 27 '22 by jxh


As a side note, the code posted above is broken: you can't access iter->second outside the rwlock-protected section. As soon as you unlock the rwlock, a writer can remove any element from the map, invalidating any iterator into it.

I know you're not doing this in your case, since you don't write anything past the beginning of the program's execution, but it's still worth mentioning.
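
A minimal fix, sketched here with the member names from the question, is to copy the mapped value out while the read lock is still held:

Service *ServiceSingleton::getService(void *serviceId)
{
  pthread_rwlock_rdlock(&serviceMapRwLock_);
  ServiceMapType::const_iterator iter = serviceMap_.find(serviceId);
  // Copy the value out while the lock is still held, so a concurrent
  // writer cannot invalidate the iterator before it is dereferenced.
  Service *service = (iter == serviceMap_.end()) ? NULL : iter->second;
  pthread_rwlock_unlock(&serviceMapRwLock_);
  return service;
}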

Also, since the behaviour you describe is serialized (the writers fill the map at the beginning, and from then on the readers only read it), you should probably write it like this:

int writerDoneWithMap = 0;
pthread_mutex_t doneMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t doneCond = PTHREAD_COND_INITIALIZER;

// The writers fill the map here...

// ...then signal the readers that they are done with it:
pthread_mutex_lock(&doneMutex);
writerDoneWithMap = 1;
pthread_cond_broadcast(&doneCond);
pthread_mutex_unlock(&doneMutex);


// Each reader simply does this once, before its first access:
pthread_mutex_lock(&doneMutex);
while (!writerDoneWithMap)
    pthread_cond_wait(&doneCond, &doneMutex);
pthread_mutex_unlock(&doneMutex);
// From this point on, read the map without any locks.
With this approach the readers pay for synchronization only once, at startup; after the writers have set the flag, the map is read without any locking at all. Note that this is correct if and only if all writes happen strictly before any reads (which is what you said); if readers and writers can overlap, it will fail.

answered Oct 27 '22 by xryl669