Parallel vector resizing not speeding up

I have 8 processors at my disposal. I wanted to resize vectors in parallel as follows:

    vector<vector <int> > test;
    test.resize(10000);
    #pragma omp parallel num_threads(8)
    {
        #pragma omp for
        for (int i = 0; i < 10000; i++) test[i].resize(500000);
    }

I noticed that the program did not use 100% of the processor power; it used only 15%. When I changed the code to

    vector<vector <int> > test;
    test.resize(1000000);
    #pragma omp parallel num_threads(8)
    {
        #pragma omp for
        for (int i = 0; i < 1000000; i++) test[i].resize(5000);
    }

the program used about 60% of the processor power. I don't understand this behavior; I expected it to use 100% of the processor power in both cases. Am I missing something here?

asked Jan 01 '23 07:01 by piotrmizerka


1 Answer

On Windows, the CRT uses the built-in Windows heap implementation, which serializes access by default: only one thread at a time can allocate from or free to a given heap.

HeapAlloc locks a CriticalSection (essentially a mutex) for the duration of the allocation, which forces allocations from different threads to run one after another.

Since vector resizing is mostly heap (re)allocation, you will not see much improvement from parallelizing it.
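One way to check this on a given machine is to time the same resize loop with different thread counts. The sketch below is a hypothetical harness, not code from the question: the name `time_resizes` and its parameters are assumptions made for illustration, and the OpenMP pragma only takes effect when the compiler is invoked with OpenMP enabled (for example `-fopenmp` or `/openmp`).

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: resizes every inner vector to `inner` elements and
// returns the elapsed wall-clock time in seconds.
// `threads` feeds OpenMP's num_threads clause; if OpenMP is not enabled,
// the pragma is ignored and the loop simply runs serially.
double time_resizes(std::vector<std::vector<int>>& rows,
                    std::size_t inner, int threads) {
    (void)threads;  // avoids an unused-parameter warning without OpenMP
    const auto start = std::chrono::steady_clock::now();
    const int n = static_cast<int>(rows.size());
    #pragma omp parallel for num_threads(threads)
    for (int i = 0; i < n; ++i) {
        rows[static_cast<std::size_t>(i)].resize(inner);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling it once with `threads = 1` and once with `threads = 8`, each time on freshly constructed vectors, gives a rough comparison. If the two timings are close, that suggests the allocator rather than the loop body is the limiting factor.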

The Windows documentation for the HEAP_NO_SERIALIZE flag describes this behavior:

Serialization ensures mutual exclusion when two or more threads attempt to simultaneously allocate or free blocks from the same heap.

Setting the HEAP_NO_SERIALIZE value eliminates mutual exclusion on the heap. Without serialization, two or more threads that use the same heap handle might attempt to allocate or free memory simultaneously, likely causing corruption in the heap.

To benefit from parallel memory allocation, use a different heap allocator, for example jemalloc.
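If replacing the allocator for the whole process is not an option, one possible approach is to route only the vectors in question through it with a custom C++ allocator. The following is a minimal sketch under stated assumptions, not the answer's own method: `raw_alloc`, `raw_free`, `HookedAllocator` and `Row` are hypothetical names, and the hooks default to `std::malloc`/`std::free` so the example compiles on its own. With a jemalloc build that uses a symbol prefix, the hooks would presumably forward to its malloc-compatible functions (commonly `je_malloc` and `je_free`, but that depends on how jemalloc was configured).

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical hooks: point these at the replacement allocator's functions.
// They default to the C library so the sketch is self-contained.
inline void* raw_alloc(std::size_t bytes) { return std::malloc(bytes); }
inline void raw_free(void* p) { std::free(p); }

// Minimal C++11 allocator that sends std::vector storage through the hooks.
// Assumes T needs no more alignment than the hooks guarantee (true for int).
template <typename T>
struct HookedAllocator {
    using value_type = T;

    HookedAllocator() = default;
    template <typename U>
    HookedAllocator(const HookedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Guard against overflow in n * sizeof(T).
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        const std::size_t bytes = n * sizeof(T);
        void* p = raw_alloc(bytes != 0 ? bytes : 1);
        if (p == nullptr) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { raw_free(p); }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U>
bool operator==(const HookedAllocator<T>&, const HookedAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const HookedAllocator<T>&, const HookedAllocator<U>&) noexcept {
    return false;
}

// Inner vector type whose storage comes from the hooked allocator.
using Row = std::vector<int, HookedAllocator<int>>;
```

The outer container would then be declared as `std::vector<Row>` in place of `vector<vector<int>>`, leaving the OpenMP loop unchanged. Whether this actually improves CPU utilization depends on the allocator behind the hooks and would need to be measured.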

answered Jan 12 '23 00:01 by rustyx