Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Are there compiler optimization issues with sharing variables between threads?

Consider the following example. The goal is to use two threads, one to "compute" a value and one that consumes and uses the computed value (I've tried to simplify this). The computing thread signals to the other thread that the value has been computed and is ready by using a condition variable, after which the waiting thread consumes the value.

// Hopefully this is free from errors, if not, please point them out so I can fix
// them and we can focus on the main question
#include <pthread.h>
#include <stdio.h>

// The data passed to each thread. These could just be global variables.
typedef struct ThreadData
{
  pthread_mutex_t mutex;
  pthread_cond_t cond;
  int spaceHit;
} ThreadData;

// The "computing" thread... just asks you to press space and checks if you did or not
void* getValue(void* td)
{
  ThreadData* data = td;

  pthread_mutex_lock(&data->mutex);

  printf("Please hit space and press enter\n");
  data->spaceHit = getchar() == ' ';
  pthread_cond_signal(&data->cond);

  pthread_mutex_unlock(&data->mutex);

  return NULL;
}

// The "consuming" thread... waits for the value to be set and then uses it
void* watchValue(void* td)
{
  ThreadData* data = td;

  pthread_mutex_lock(&data->mutex);
  if (!data->spaceHit)
      pthread_cond_wait(&data->cond, &data->mutex);
  pthread_mutex_unlock(&data->mutex);

  if (data->spaceHit)
      printf("You hit space!\n");
  else
    printf("You did NOT hit space!\n");

  return NULL;
}

int main()
{
  // Boring main function. Just initializes things and starts the two threads.
  pthread_t threads[2];
  pthread_attr_t attr;
  ThreadData data;
  data.spaceHit = 0;

  pthread_mutex_init(&data.mutex, NULL);
  pthread_cond_init(&data.cond, NULL);

  pthread_attr_init(&attr);
  pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
  pthread_create(&threads[0], &attr, watchValue, &data);
  pthread_create(&threads[1], &attr, getValue, &data);

  pthread_join(threads[0], NULL);
  pthread_join(threads[1], NULL);

  pthread_attr_destroy(&attr);
  pthread_mutex_destroy(&data.mutex);
  pthread_cond_destroy(&data.cond);

  return 0;
}

My main question has to do with potential optimizations done by the compiler. Is the compiler allowed to do tricky optimizations and "optimize" the program flow such that the following happens:

void* watchValue(void* td)
{
  ThreadData* data = td;

  pthread_mutex_lock(&data->mutex);
  if (!data->spaceHit) // Here, it might remember the result of data->spaceHit
      pthread_cond_wait(&data->cond, &data->mutex);
  pthread_mutex_unlock(&data->mutex);

  if (remember the old result of data->spaceHit without re-getting it)
      printf("You hit space!\n");
  else
    printf("You did NOT hit space!\n");
  // The above if statement now may not execute correctly because it didn't
  // re-get the value of data->spaceHit, but "remembered" the old result
  // from the if statement a few lines above

  return NULL;
}

I'm a bit paranoid that the compiler's static analysis might determine that data->spaceHit does not change between the two if statements, and thus justify using the old value of data->spaceHit instead of re-getting the new value. I don't know enough about threading and compiler optimizations to know if this code is safe or not. Is it?


Note: I've written this in C, and tagged this as C and C++. I'm using this in a C++ library, but since I'm using C threading APIs (pthreads and Win32 threads) and have the option to embed C in this portion of the C++ library, I've tagged this as both C and C++.

like image 622
Cornstalks Avatar asked May 29 '13 01:05

Cornstalks


1 Answers

No, the compiler is not allowed to cache the value of data->spaceHit across the calls to pthread_cond_wait() or pthread_mutex_unlock(). These are both specifically called out as "functions [which] synchronize memory with respect to other threads", which must necessarily act as compiler barriers.

For a compiler to be part of a conforming pthreads implementation, it must not perform that optimisation in the case you've given.

like image 200
caf Avatar answered Nov 15 '22 19:11

caf