I wanted to parallelize a for loop and found out about std::for_each as well as its execution policies. Surprisingly, it didn't parallelize when using GCC:
#include <algorithm>
#include <chrono>
#include <execution>
#include <iostream>
#include <random>
#include <thread>
#include <vector>  // std::vector

int main() {
    // Fill a vector with 0..999
    std::vector<int> foo;
    foo.reserve(1000);
    for (int i = 0; i < 1000; i++) {
        foo.push_back(i);
    }

    std::for_each(std::execution::par_unseq,
                  foo.begin(), foo.end(),
                  [](auto &&item) {
                      std::cout << item << std::endl;
                      // Sleep for a random 10-100 ms to simulate work
                      std::random_device dev;
                      std::mt19937 rng(dev());
                      std::uniform_int_distribution<std::mt19937::result_type> dist6(10, 100);
                      std::this_thread::sleep_for(std::chrono::milliseconds(dist6(rng)));
                      std::cout << "Thread ID: " << std::this_thread::get_id() << std::endl;
                  });
}
This code still runs sequentially. Using MSVC, the code is parallelized and finishes much quicker.
GCC:
$ gcc --version
gcc (Ubuntu 10.1.0-2ubuntu1~18.04) 10.1.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
MSVC:
>cl.exe
Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29112 for x86
Copyright (C) Microsoft Corporation. All rights reserved.
usage: cl [ option... ] filename... [ /link linkoption... ]
CMakeLists.txt:
cmake_minimum_required(VERSION 3.17)
project(ParallelTesting)
set(CMAKE_CXX_STANDARD 20)
add_executable(ParallelTesting main.cpp)
Is there anything specific I need to do to enable parallelization with GCC as well?
ldd output of my binary:
$ ldd my_binary
linux-vdso.so.1 (0x00007ffe9e6b9000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f79efaa0000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f79ef881000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f79ef4ad000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f79ef295000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f79eeea4000)
/lib64/ld-linux-x86-64.so.2 (0x00007f79f041a000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f79eeb06000)
The debug and release versions of the binary have the same ldd output overall.
I solved it by first upgrading my WSL Ubuntu distribution from version 18.04 to 20.04, since after running sudo apt install gcc libtbb-dev to install TBB I still got the following error:
#error Intel(R) Threading Building Blocks 2018 is required; older versions are not supported.
This is caused by TBB being too old.
Now with TBB 2020.1-2 installed, it's working as expected:
$ sudo apt install libtbb-dev
[sudo] password for ubuntu:
Reading package lists... Done
Building dependency tree
Reading state information... Done
libtbb-dev is already the newest version (2020.1-2).
0 upgraded, 0 newly installed, 0 to remove and 10 not upgraded.
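To see which backend the headers actually picked, something like the following sketch might help. It assumes libstdc++ from the GCC 9/10 era, which (as far as I can tell) defines the internal macros _PSTL_PAR_BACKEND_TBB or _PSTL_PAR_BACKEND_SERIAL depending on whether the TBB headers are found. These macros are implementation details, not a documented interface, and detected_par_backend is just a name made up for this example.

```cpp
#include <execution>  // pulls in the library's parallel-algorithm configuration
#include <string>

// Hypothetical helper: reports which parallel backend the standard library
// headers selected at compile time.
// ASSUMPTION: _PSTL_PAR_BACKEND_TBB / _PSTL_PAR_BACKEND_SERIAL are libstdc++
// internals and may be renamed or removed in other versions or libraries.
std::string detected_par_backend() {
#if defined(_PSTL_PAR_BACKEND_TBB)
    return "tbb";      // TBB headers were found; the binary must link against tbb
#elif defined(_PSTL_PAR_BACKEND_SERIAL)
    return "serial";   // no usable TBB found; parallel policies run sequentially
#else
    return "unknown";  // not libstdc++, or the macros are named differently
#endif
}
```

Printing the result from main (for example std::cout << detected_par_backend() << '\n';) gives a quick hint. A result of "serial" would be consistent with the ldd output in the question, which lists no TBB library.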
This answer describes all the details very well.
Since I'm using CMake, I also had to add the following line to my CMakeLists.txt:
# Link against the dependency of Intel TBB (for parallel C++17 algorithms)
target_link_libraries(${PROJECT_NAME} tbb)
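Once TBB is linked, a rough way to confirm that the work is really spread across threads is to count the distinct thread IDs that run the callback. The sketch below is only an illustration: count_threads_used is a hypothetical helper, not part of any library. It takes a mutex inside the callback, which is fine for std::execution::par but not allowed for par_unseq, since locking is not safe in vectorized code.

```cpp
#include <algorithm>
#include <cstddef>
#include <execution>
#include <mutex>
#include <set>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: runs std::for_each over n elements with the given
// execution policy and returns how many distinct threads ran the callback.
template <class Policy>
std::size_t count_threads_used(Policy&& policy, int n) {
    std::vector<int> items(n);      // n value-initialized elements
    std::set<std::thread::id> ids;  // thread IDs seen so far
    std::mutex m;                   // guards ids; do not use with par_unseq

    std::for_each(std::forward<Policy>(policy), items.begin(), items.end(),
                  [&](int&) {
                      std::lock_guard<std::mutex> lock(m);
                      ids.insert(std::this_thread::get_id());
                  });
    return ids.size();
}
```

With std::execution::seq this returns 1 for any non-empty range. With std::execution::par on a TBB-backed build it will typically return more than 1, although it can still be 1 for very small or very fast workloads, so treat the number as a hint rather than proof.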