
pthread_cond_timedwait() root causes of EINVAL

Tags:

pthreads

In very rare cases I see pthread_cond_timedwait() return EINVAL, which causes a fatal crash on our system. I understand that this means one of the arguments passed in must be invalid, but how does the mutex or condition variable become invalid?

Is there any way to check these arguments before calling pthread_cond_timedwait() to prevent a crash?

Jon Kump asked Jul 03 '12 20:07



2 Answers

Exactly what counts as invalid is unspecified, but here are a few reasons I have observed pthread_cond_timedwait() returning EINVAL:

  • The condition variable and/or mutex was not initialized properly. Check the return values of the initialization calls, and verify that the correct pthread library is explicitly being linked. Sporadic issues may occur when different versions of glibc are linked, resulting in difficult-to-debug cases where the init call returns success but the object is not correctly initialized.
  • Any undefined behavior may result in an invalid internal state, which may or may not be detected by the pthread calls. Undefined behavior can result from:
    • Initializing the mutex or condition variable more than once without it being destroyed.
    • Using the mutex or condition variable after it has been destroyed, but before it has been re-initialized.
    • The condition and/or mutex was manually written over by application code.
    • The mutex or condition variable was destroyed while a thread was waiting on the object.
  • Different mutexes are used for concurrent waits on the same condition variable.
  • The abstime argument had a tv_nsec value of less than 0 or greater than or equal to 1,000,000,000 (the valid range is 0 to 999,999,999).
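To illustrate the initialization points in the list above, here is a minimal sketch that checks every init result and refuses double initialization or destruction of an uninitialized object. The struct waiter type and the waiter_init()/waiter_destroy() names are hypothetical, introduced only for this example. The initialized flag is plain bookkeeping that assumes the caller zero-initializes the struct and does not call init/destroy from several threads at once.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical wrapper: one condition variable paired with exactly one mutex. */
struct waiter {
    int             initialized; /* bookkeeping flag; struct must start zeroed */
    int             ready;       /* predicate, protected by mutex */
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
};

/* Returns 0 on success, or an error code. */
int waiter_init(struct waiter *w)
{
    int rc;

    if (w->initialized)
        return EBUSY;            /* initializing twice is undefined behavior */

    rc = pthread_mutex_init(&w->mutex, NULL);
    if (rc != 0)
        return rc;

    rc = pthread_cond_init(&w->cond, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&w->mutex);
        return rc;
    }

    w->ready = 0;
    w->initialized = 1;
    return 0;
}

/* Returns 0 on success, or an error code. No thread may still be waiting. */
int waiter_destroy(struct waiter *w)
{
    int rc_cond, rc_mutex;

    if (!w->initialized)
        return EINVAL;           /* never initialized, or already destroyed */

    rc_cond  = pthread_cond_destroy(&w->cond);
    rc_mutex = pthread_mutex_destroy(&w->mutex);
    w->initialized = 0;

    return rc_cond != 0 ? rc_cond : rc_mutex;
}
```

Keeping the mutex and the condition variable in one struct also makes it harder to accidentally wait on the same condition variable with two different mutexes.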

Short of manually mimicking the validation that pthread performs, I do not know of a way to check the arguments before calling pthread_cond_timedwait(). However, pthread_cond_timedwait() returning EINVAL should not by itself cause a fatal crash, as it is a specified error case. Consider examining other areas of application code that may not handle the return value appropriately, for example code that assumes success whenever the return value is not ETIMEDOUT.

Tanner Sansbury answered Sep 24 '22 20:09


I'd like to share my experience with this issue. The timeout is a struct timespec, and its tv_nsec field must stay within [0, 999999999]. If you set the nanosecond value to one second or more, some Linux systems may return EINVAL.

struct timespec {
    time_t tv_sec;      /* Seconds */
    long   tv_nsec;     /* Nanoseconds [0 .. 999999999] */ 
};

Hope this helps you out of trouble.
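One way to keep tv_nsec in range is to carry any overflow into tv_sec when building the deadline. The sketch below is only an illustration. The name deadline_after_ms() is hypothetical, and it assumes timeout_ms is non-negative and that *now already holds a valid timespec.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: compute *out = *now + timeout_ms, normalized so that
 * out->tv_nsec stays within [0, 999999999]. Assumes timeout_ms >= 0.
 */
void deadline_after_ms(const struct timespec *now, long timeout_ms,
                       struct timespec *out)
{
    out->tv_sec  = now->tv_sec + timeout_ms / 1000;
    out->tv_nsec = now->tv_nsec + (timeout_ms % 1000) * 1000000L;

    /* Both terms are below one second, so a single carry is enough. */
    if (out->tv_nsec >= NSEC_PER_SEC) {
        out->tv_sec  += 1;
        out->tv_nsec -= NSEC_PER_SEC;
    }
}
```

For a condition variable with default attributes, *now would typically come from clock_gettime(CLOCK_REALTIME, &now), since pthread_cond_timedwait() takes an absolute time rather than a relative timeout.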

user652680 answered Sep 25 '22 20:09