I'm not sure about how pthread dataspecific works : considering the next code (found on the web), does this means i can create for example 5 threads in the main, have a call to func in only some of them (let's say 2) those threads would have the data 'key' set to something (ptr = malloc(OBJECT_SIZE) ) and the other threads would have the same key existing but with a NULL value?
static pthread_key_t key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;
static void
make_key()
{
  (void) pthread_key_create(&key, NULL);
}
func()
{
  void *ptr;
  (void) pthread_once(&key_once, make_key);
  if ((ptr = pthread_getspecific(key)) == NULL) {
    ptr = malloc(OBJECT_SIZE);
    ...
    (void) pthread_setspecific(key, ptr);
  }
  ...
}
Some explanation on how dataspecific works and how it may have been implemented in pthread (simple way) would be appreciated!
Pthread uses sys_clone() to create new threads, which the kernel sees as a new task that happens to share many data structures with other threads. To do synchronization, pthread relies heavily on futexes in the kernel.
Pthreads are implemented as user threads by the runtime library. Most portable implementation since it requires no kernel support. Fast context switches between user threads because it is handled entirely in user space without the overhead of the kernel implementing a process-level context switch.
The pthread_create() function starts a new thread in the calling process. The new thread starts execution by invoking start_routine(); arg is passed as the sole argument of start_routine().
Returned value If unsuccessful, pthread_create() returns -1 and sets errno to one of the following values: Error Code. Description. EAGAIN.
Your reasoning is correct. These calls are for thread-specific data. They're a way of giving each thread a "global" area where it can store what it needs, but only if it needs it.
The key is shared among all threads, since it's created with pthread_once() the first time it's needed, but the value given to that key is different for each thread (unless it remains set to NULL). By having the value a void* to a memory block, a thread that needs thread-specific data can allocate it and save the address for later use. And threads that don't call a routine that needs thread-specific data never waste memory since it's never allocated for them.
The one area where I have used them is to make a standard C library thread-safe. The strtok() function (as opposed to a thread-safe strtok_r() which was considered an abomination when we were doing this) in an implementation I was involved in used almost this exact same code the first time it was called, to allocate some memory which would be used by strtok() for storing information for subsequent calls. These subsequent calls would retrieve the thread-specific data to continue tokenizing the string without interfering with other threads doing the exact same thing.
It meant users of the library didn't have to worry about cross-talk between threads - they still had to ensure a single thread didn't call the function until the last one had finished but that's the same as with single-threaded code.
It allowed us to give a 'proper' C environment to each thread running in our system without the usual "you have to call these special non-standard re-entrant routines" limitations that other vendors imposed on their users.
As for implementation, from what I remember of DCE user-mode threads (which I think were the precursor to the current pthreads), each thread had a single structure which stored things like instruction pointers, stack pointers, register contents and so on. It was a very simple matter to add one pointer to this structure to achieve very powerful functionality with minimal cost. The pointer pointed to a array (linked list in some implementations) of key/pointer pairs so each thread could have multiple keys (e.g., one for strtok(), one for rand()).
The answer to your first question is yes. In simple terms, it allows each thread to allocate and save its own data. This is roughly equivalent to w/o each thread simply allocating and passing around its own data structure. The API saves you the trouble of passing the thread-local structure to all subfunctions, and allows you to look it up on demand instead.
The implementation really doesn't matter all that much (it may vary per-OS), as long as the results are the same.
You can think of it as a two-level hashmap. The key specifies which thread-local "variable" you want to access, and the second level might perform a thread-id lookup to request the per-thread value.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With