I intend to use buffers std::vector<size_t> buffer(100), one in each thread in a parallelization of a loop, as suggested by this code:
std::vector<size_t> buffer(100);
#pragma omp parallel for private(buffer)
for(size_t j = 0; j < 10000; ++j) {
// ... code using the buffer ...
}
This code does not work. Although there is a buffer for every thread, those can have size 0.
How can I allocate the buffer at the beginning of each thread? Can I still use #pragma omp parallel for? And can I do it more elegantly than this:
std::vector<size_t> buffer;
#pragma omp parallel for private(buffer)
for(size_t j = 0; j < 10000; ++j) {
if(buffer.size() != 100) {
#pragma omp critical
buffer.resize(100);
}
// ... code using the buffer ...
}
The question and the accepted answer have been around for a while, but here is some further information that gives additional insight into OpenMP and might therefore be helpful to other users.
In C++, the private and firstprivate clauses handle class objects differently:
From the OpenMP Application Program Interface v3.1:
private: the new list item is initialized, or has an undefined initial value, as if it had been locally declared without an initializer. The order in which any default constructors for different private variables of class type are called is unspecified.
firstprivate: for variables of class type, a copy constructor is invoked to perform the initialization of list variables.
That is, private calls the default constructor, whereas firstprivate calls the copy constructor of the corresponding class.
The default constructor of std::vector constructs an empty container with no elements, which is why the buffers have size 0.
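The difference can be illustrated without OpenMP at all, since the two clauses boil down to ordinary C++ initialization. The following is only a sketch: the function names are made up for illustration, and each one mimics what a single thread's copy of the variable would look like under the respective clause.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mimics private(buffer): the thread's copy is default-constructed,
// so it does not inherit the size of the original vector.
std::size_t size_after_private() {
    std::vector<std::size_t> thread_copy;  // default constructor -> empty
    return thread_copy.size();
}

// Mimics firstprivate(buffer): the thread's copy is copy-constructed
// from the original, so size and contents carry over.
std::size_t size_after_firstprivate(const std::vector<std::size_t>& original) {
    std::vector<std::size_t> thread_copy(original);  // copy constructor
    return thread_copy.size();
}

// Hypothetical helper that checks both cases for a 100-element buffer.
void check_clause_semantics() {
    std::vector<std::size_t> original(100);
    assert(size_after_private() == 0);
    assert(size_after_firstprivate(original) == 100);
}
```
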
To answer the question, here is another solution that does not require splitting the OpenMP region:
std::vector<size_t> buffer(100, 0);
#pragma omp parallel for firstprivate(buffer)
for (size_t j = 0; j < 10000; ++j) {
// use the buffer
}
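EDIT: a word of caution regarding private variables in general: the thread stack size is limited and, unless explicitly set (environment variable OMP_STACKSIZE), compiler dependent. If you use private variables with a large memory footprint, stack overflow may become an issue. (A std::vector with the default allocator keeps its elements on the heap, so this mainly affects large fixed-size arrays and similar objects.)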
Split the OpenMP region as shown in this question.
Then declare the vector inside the outer region, but outside the for-loop itself. This will make one local vector for each thread.
#pragma omp parallel
{
std::vector<size_t> buffer(100);
#pragma omp for
for(size_t j = 0; j < 10000; ++j) {
// ... code using the buffer ...
}
}
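For a version that can be compiled and checked, here is a minimal sketch of the same split-region pattern wrapped in a function. The function name, its parameters, and the workload (summing j + k over the buffer) are invented for illustration and are not part of the original answer. The code also compiles without OpenMP enabled, in which case the pragmas are ignored and the loop runs serially.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical workload: out[j] = sum over k of (j + k), k in [0, buffer_size).
std::vector<std::size_t> fill_with_thread_buffers(std::size_t n,
                                                  std::size_t buffer_size) {
    std::vector<std::size_t> out(n, 0);

    #pragma omp parallel
    {
        // Constructed once per thread on entry to the parallel region,
        // then reused for every iteration that thread executes.
        std::vector<std::size_t> buffer(buffer_size);

        #pragma omp for
        for (std::size_t j = 0; j < n; ++j) {
            // Fill the thread-local scratch buffer for this iteration.
            for (std::size_t k = 0; k < buffer.size(); ++k) {
                buffer[k] = j + k;
            }
            std::size_t sum = 0;
            for (std::size_t v : buffer) {
                sum += v;
            }
            out[j] = sum;  // each index j is written by exactly one thread
        }
    }
    return out;
}
```

With GCC or Clang, OpenMP is typically enabled with the -fopenmp flag. The buffer is allocated once per thread rather than once per iteration, which is the point of moving the declaration out of the loop.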