Class declaring itself (*this) private to avoid race conditions / quest for threadprivate in gcc abandoned

Tags: c++, gcc, openmp

I want to avoid a race condition in parallel code. My class holds several data members that every thread shares (effectively global state; for simplicity, say just one, x) as well as a for loop that I wish to parallelize. The actual code also has a method that takes a pointer to a class instance, in this case the object itself, as its argument and accesses even more of this shared state. So it might make sense to make the entire instance threadprivate. I am using OpenMP.

A minimum working example is:

#include <iostream>
#include <omp.h>
class lotswork {
public:
    int x;
    int f[10];

    lotswork(int i = 0) { x = i; };

    void addInt(int y) { x = x + y; }

    void carryout(){

        #pragma omp parallel for
        for (int n = 0; n < 10; ++n) {
            this->addInt(n);
            f[n] = x;
        }
        for(int j=0;j<10;++j){
            std::cout << " array at " << j << " = " << f[j] << std::endl;
        }
        std::cout << "End result = " << x << std::endl;
    }
};



int main() {
    lotswork production(0);
    #pragma omp threadprivate(production)
    production.carryout();

}

My question is: how can I do this? Using the threadprivate directive produces the following compiler error:

error: ‘production’ declared ‘threadprivate’ after first use

I think the compiler issue described in the following quote still hasn't been solved:

This brings us to why I used the Intel compiler. Visual Studio 2013 as well as g++ (4.6.2 on my computer, Coliru (g++ v5.2), codingground (g++ v4.9.2)) allow only POD types (source). This has been listed as a bug for almost a decade and still hasn't been fully addressed. The error given by Visual Studio is

error C3057: 'globalClass' : dynamic initialization of 'threadprivate' symbols is not currently supported

and the error given by g++ is

error: 'globalClass' declared 'threadprivate' after first use

The Intel compiler works with classes.

Unfortunately, I don't have access to Intel's compiler and use GCC 8.1.0 instead. I did some background research and found a discussion of this here, but the trail went cold ten years ago. I am asking this question because several people have run into this issue and either solved it by declaring a class pointer, as here, or proposed terrible workarounds. The pointer approach seems misguided because a pointer is usually declared as a constant, and even then only the pointer is threadprivate while the instance it points to is still shared.

Attempt at solution

I believe I can use the private clause but am unsure how to apply it to an entire class instance, although I'd prefer threadprivate. A similar example to mine above, on which I modeled my MWE, is discussed in Chapter 7, Figure 7.17 of this book, but without a solution. (I am well aware of the race condition and why it's a problem.)

If necessary I can give evidence that the output of the above programme without any extra keywords is nondeterministic.

Another attempt at solution

I have now thought of a solution but for some reason, it won't compile. From a thread-safety and logical standpoint my problem should be solved by the following code. Yet, there must be some sort of error.

#include <iostream>
#include <omp.h>
class lotswork : public baseclass {
public:
    int x;
    int f[10];

    lotswork(int i = 0) { x = i; };

    void addInt(int y) { x = x + y; }
    
        void carryout(){
    //idea is to declare the instance private
    #pragma omp parallel firstprivate(*this){
    //here, another instance of the base class will be instantiated which is inside the parallel region and hence automatically private
    baseclass<lotswork> solver;

  #pragma omp for
  for (int n = 0; n < 10; ++n) 
      {
          this->addInt(n);
          f[n] = x;
          solver.minimize(*this,someothervariablethatisprivate);
      }
                                             } //closing the pragma omp parallel region
                for(int j=0;j<10;++j){
                    std::cout << " array at " << j << " = " << f[j] << std::endl;
                }
                std::cout << "End result = " << x << std::endl;
            }
        };
        
        
        
    int main() {
        lotswork production(0);
        #pragma omp threadprivate(production)
        production.carryout();
    
    }

So this code, based on the definitions, should do the trick but somehow it doesn't compile. How can I put this code together so it achieves the desired thread-safety and compiles, respecting the constraint that threadprivate is not an option for non-Intel compiler folks?

Hirek asked Jan 02 '19 23:01


2 Answers

It seems like there is some confusion about OpenMP constructs here. threadprivate is used, much like thread_local, to create a per-thread copy of an object with static storage duration, that is, either a global or a static variable. As noted, there are some implementation issues with this, but even if the implementations could handle the class, using threadprivate on a non-static local variable (such as production in main) would still produce an error.
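
To illustrate where threadprivate is accepted, here is a minimal sketch using a file-scope variable of a plain int type, which GCC handles. The names tp_counter and sum_with_threadprivate are made up for this illustration and do not come from the question.

```cpp
// A file-scope variable has static storage duration, so it may be
// declared threadprivate (a local variable in main may not).
static int tp_counter = 0;
#pragma omp threadprivate(tp_counter)

// Hypothetical helper: sums 0..n-1 with one private counter per thread.
int sum_with_threadprivate(int n) {
    int total = 0;
    #pragma omp parallel reduction(+ : total)
    {
        tp_counter = 0;              // each thread resets its own copy
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            tp_counter += i;         // no race: tp_counter is per thread
        }
        total += tp_counter;         // combine the per-thread results
    }
    return total;
}
```

The directive has to follow the declaration and precede the first use of the variable, which is why it sits directly under the definition.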

As to the error, it's hard to say without output, but it is likely multiple things:

  1. The unmatched closing brace. Placing a { at the end of a pragma line does not open a block; the brace needs to be on the following line.
  2. It is not valid to privatize the enclosing class instance that way. A data-sharing clause expects variable names, and *this is an expression.

If you need to create a private copy of the enclosing class in each thread, you can do so by copy-constructing the class into a variable declared inside a parallel region:

#pragma omp parallel
{
  lotswork tmp(*this);
  // do things with private version
}

Note, however, that the entire copy is private, so f in the original object will not be updated unless you perform the addInt equivalents on the private copies and then do the f[n] assignments on the original.
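
A fuller sketch of that pattern might look like the following. The class name lotswork_private, the barrier, and the reduction on a local total are assumptions added for illustration; they are one possible way to write the results back, not the only one.

```cpp
class lotswork_private {
public:
    int x;
    int f[10];

    explicit lotswork_private(int i = 0) : x(i), f{} {}

    void addInt(int y) { x += y; }

    void carryout() {
        int total = 0;  // sum of every thread's contribution to x
        #pragma omp parallel reduction(+ : total)
        {
            lotswork_private tmp(*this);  // private copy, one per thread
            // Assumption: wait until every thread has finished copying
            // before any thread writes to the shared f below.
            #pragma omp barrier
            #pragma omp for
            for (int n = 0; n < 10; ++n) {
                tmp.addInt(n);  // touches only the private copy
                f[n] = tmp.x;   // distinct index per iteration: no race
            }
            total += tmp.x - x; // x itself is only read inside the region
        }
        x += total;             // serial update of the original object
    }
};
```

The final x is the same for any thread count, but each f[n] holds the running sum of the thread that handled iteration n, so the individual f values still depend on how iterations are scheduled.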

Edit: I originally mentioned using the default(firstprivate) clause, but the default clause only offers private and firstprivate for Fortran (OpenMP 5.1 later added them for C and C++). To get the same effect in C++, either do the above and copy-construct into a new instance in each thread, or use a lambda that captures by value by default and make that lambda firstprivate. Capturing *this by value requires C++17, but it does exactly what's requested:

auto fn = [=,*this](){
  // do things with private copies
  // all updates to persist in shared state through pointers
};
#pragma omp parallel firstprivate(fn)
fn();
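
Filled in for the class from the question, the lambda approach could be sketched as below. The class name lotswork_lambda, the out pointer, and the mutable qualifier are assumptions for illustration; mutable is added because a by-value capture is otherwise const inside the lambda body.

```cpp
class lotswork_lambda {
public:
    int x;
    int f[10];

    explicit lotswork_lambda(int i = 0) : x(i), f{} {}

    void addInt(int y) { x += y; }

    void carryout() {
        int* out = f;  // copied by value, but still points at the original f
        // [=, *this] stores a copy of the whole object in the closure
        // (C++17); mutable lets the body modify that copy.
        auto fn = [=, *this]() mutable {
            #pragma omp for
            for (int n = 0; n < 10; ++n) {
                addInt(n);   // updates the closure's copy of the object
                out[n] = x;  // x is the copy's x; out[n] is shared storage
            }
        };
        // Each thread receives its own copy of the closure.
        #pragma omp parallel firstprivate(fn)
        fn();
    }
};
```

With this version the original x is left untouched, and only the writes through out reach the original object.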
Tom Scogland answered Nov 20 '22 22:11


This is a long-standing missing GCC feature:

  • OpenMP threadprivate directive does not work with non-POD types

With current GCC versions, thread_local is expected to work, though:

int main() {
  thread_local lotswork production(0);
  production.carryout();
}

However, I do not think this will work in your case because the parallel loop in carryout will still operate on a single lotswork instance. I believe this would apply to the original code using threadprivate, too. You probably need to move the parallel loop outside of the carryout member function.
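
One way to read that suggestion is sketched below: the loop lives in a free function and each thread works on its own thread_local instance. The names lotswork_tl, thread_instance, and run_outside are hypothetical and chosen only for this example.

```cpp
class lotswork_tl {
public:
    int x;
    explicit lotswork_tl(int i = 0) : x(i) {}
    void addInt(int y) { x += y; }
};

// Hypothetical accessor: returns the calling thread's own instance.
lotswork_tl& thread_instance() {
    thread_local lotswork_tl instance(0);  // constructed once per thread
    return instance;
}

// The parallel loop sits outside the class, so every thread can pick
// up its own instance before doing any work.
int run_outside(int n) {
    int total = 0;
    #pragma omp parallel reduction(+ : total)
    {
        lotswork_tl& mine = thread_instance();
        mine.x = 0;  // threads may be reused between parallel regions
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            mine.addInt(i);  // no race: each thread has its own object
        }
        total += mine.x;
    }
    return total;
}
```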

Florian Weimer answered Nov 20 '22 23:11