
Using the OpenMP threadprivate directive on static instances of C++ STL types

Tags: c++, stl, openmp

Consider the following snippet:

#include <map>

class A {
    static std::map<int,int> theMap;
#pragma omp threadprivate(theMap)
};

std::map<int,int> A::theMap;

Compilation with OpenMP fails with the following error message:

$ g++ -fopenmp -c main.cpp 
main.cpp:5:34: error: ‘threadprivate’ ‘A::theMap’ has incomplete type

I don't understand this. I can compile without the #pragma directive, which should mean that std::map is not incomplete. I can also compile if theMap is a primitive type (double, int...).

How do I make a global static std::map threadprivate?

asked Nov 08 '11 13:11 by Arek' Fu



2 Answers

This is a compiler restriction. The Intel C/C++ compiler supports threadprivate on objects of C++ class type, while GCC and MSVC currently do not.

For example, in MSVC (VS 2010), you will get this error (I removed the class):

static std::map<int,int> theMap;
#pragma omp threadprivate(theMap)

error C3057: 'theMap' : dynamic initialization of 'threadprivate' symbols is not currently supported

So the workaround is pretty obvious, but dirty: build a very simple thread-local storage yourself. A simple approach would be:

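#include <map>

const static int MAX_THREAD = 64;  // upper bound on the number of OpenMP threads

struct MY_TLS_ITEM
{
  std::map<int,int> theMap;
  // Pad each item to 64 bytes; this assumes sizeof(theMap) < 64.
  char padding[64 - sizeof(theMap)];
};

// One slot per thread; the array starts on a 64-byte boundary (MSVC syntax).
__declspec(align(64)) MY_TLS_ITEM tls[MAX_THREAD];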

The padding is there to avoid false sharing. I assume a 64-byte cache line, as on modern Intel x86 processors. __declspec(align(64)) is an MSVC extension that places the array on a 64-byte boundary. Since each item is padded to 64 bytes, every element of tls sits on its own cache line, so there is no false sharing. GCC's equivalent is __attribute__ ((aligned(64))).

In order to access this simple TLS, you can do this:

tls[omp_get_thread_num()].theMap;

Of course, this only gives each thread its own map when it is called inside an OpenMP parallel construct; outside one, omp_get_thread_num() returns 0. The nice thing is that OpenMP provides an abstract thread ID in [0, N), where N is the number of threads in the current team, which must not exceed MAX_THREAD here. This enables a fast and simple TLS implementation. In general, a native thread ID from the operating system is an arbitrary integer, so you would usually need a hash table, whose access time is longer than that of a simple array.
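For readers on a C++11 compiler, the same per-thread slot array can be sketched without compiler-specific alignment syntax. This is only an illustration under stated assumptions: the names kMaxThreads, kCacheLine, Slot, localMap and countIterations are hypothetical, the 64-byte cache line is assumed as above, and alignas on the struct replaces the manual padding. The omp.h include is guarded so the file also builds without -fopenmp, in which case everything runs on a single thread.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#ifdef _OPENMP
#include <omp.h>
#endif

const int kMaxThreads = 64;         // hypothetical upper bound on team size
const std::size_t kCacheLine = 64;  // assumed cache-line size in bytes

// alignas rounds sizeof(Slot) up to a multiple of kCacheLine,
// so neighbouring slots never share a cache line.
struct alignas(kCacheLine) Slot {
    std::map<int, int> theMap;
};

static Slot slots[kMaxThreads];

// Returns the map that belongs to the calling thread.
std::map<int, int>& localMap() {
#ifdef _OPENMP
    int tid = omp_get_thread_num();  // 0 outside a parallel region
#else
    int tid = 0;                     // built without OpenMP: one thread only
#endif
    assert(tid >= 0 && tid < kMaxThreads);
    return slots[tid].theMap;
}

// Each thread counts its share of the iterations in its own map;
// the per-thread counts are summed after the loop.
int countIterations(int n) {
    for (int t = 0; t < kMaxThreads; ++t) slots[t].theMap.clear();

    // Request a small team, well under kMaxThreads.
    #pragma omp parallel for num_threads(4)
    for (int i = 0; i < n; ++i) {
        ++localMap()[0];
    }

    int total = 0;
    for (int t = 0; t < kMaxThreads; ++t) {
        std::map<int, int>::const_iterator it = slots[t].theMap.find(0);
        if (it != slots[t].theMap.end()) total += it->second;
    }
    return total;
}
```

No locking is needed inside the loop because each thread only touches its own slot; the maps are read together only after the parallel region has ended.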

answered Oct 18 '22 05:10 by minjang


The incomplete type error is a bug in the compiler. It can be worked around by instantiating std::map<int,int> before the threadprivate directive. But once you get past that issue, GCC 4.7 still doesn't support dynamic initialization of threadprivate variables. This will be supported in GCC 4.8.
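One way to instantiate the type early is to take its sizeof before the class definition, since sizeof requires a complete type. The sketch below is a guess at what such a workaround might look like, not a tested fix for any particular GCC release: kForceMapInstantiation and remember are hypothetical names, and theMap is made public only so the helper can reach it. On a compiler without dynamic initialization of threadprivate variables (GCC 4.7 per the above), the directive would still be rejected for that second reason.

```cpp
#include <cstddef>
#include <map>

// sizeof needs a complete type, so this line forces the compiler to
// instantiate std::map<int,int> before the directive below refers to it.
static const std::size_t kForceMapInstantiation = sizeof(std::map<int, int>);

class A {
public:
    static std::map<int, int> theMap;
#pragma omp threadprivate(theMap)
};

std::map<int, int> A::theMap;

// Hypothetical helper: stores a value in the calling thread's copy of
// theMap and returns how many entries that copy now holds.
std::size_t remember(int key, int value) {
    A::theMap[key] = value;
    return A::theMap.size();
}
```

Without -fopenmp the pragma is ignored and theMap is an ordinary static member shared by all threads.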

answered Oct 18 '22 03:10 by Jason Merrill