I made a program which calculates prime numbers on multiple cores. (Please ignore the fact, that the algorithm is not perfectly effective and numbers 0 and 1 are here considered to be primes. The purpose is only to practice using threads.)
The variable taken
(a number to be tested next) is beeing shared across 8 threads.
The problem is that it can be incremented by one thread and right after that by another thread and read by them when its already incremented twice (or more times), so some values can be skipped, which is a bad thing.
I assumed it can be solved by using std::atomic_uint
as the variable type, but I was apparently wrong.
Is there any way I could solve this problem without the need of using std::mutex
since I heard it causes considerable overhead?
source code:
#include <iostream>
#include <chrono>
#include <vector>
#include <algorithm>
#include <thread>
#include <atomic>
int main()
{
const uint MAX = 1000;
std::vector<bool> isPrime(MAX), done(MAX);
std::fill(done.begin(), done.end(), false);
std::atomic_uint taken{0}; //shared variable
std::vector<std::thread> threads;
auto start = std::chrono::system_clock::now();
for (uint i = 0; i < 8; ++i) {
threads.emplace_back(
[&](){
bool res;
for (uint tested; (tested = taken.fetch_add(1)) < MAX; ) { //taken should be incremented and copied atomically
res = true;
for (uint k = 2; k < tested; ++k) {
if (tested % k == 0) {
res = false;
break;
}
}
isPrime[tested] = res;
done[tested] = true;
}
}
);
}
for (auto & t : threads) {
t.join();
}
auto end = std::chrono::system_clock::now();
auto milliseconds = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
uint num = std::count_if(isPrime.begin(), isPrime.end(), [](bool b){return b;});
uint nDone = std::count_if(done.begin(), done.end(), [](bool b){return !b;});
std::cout << "number: " << num << " duration: " << milliseconds.count() << '\n';
std::cout << "not done: " << nDone << '\n';
for (uint i = 0; i < MAX; ++i) { //Some numbers are always skipped
if (!done[i]) {
std::cout << i << ", ";
}
}
std::cout << '\n';
return 0;
}
The code was compiled using g++
with -O3
and -pthread
arguments.
output:
number: 169 duration: 1
not done: 23
143, 156, 204, 206, 207, 327, 328, 332, 334, 392, 393, 396, 502, 637, 639, 671, 714, 716, 849, 934, 935, 968, 969,
The output is different every time.
The specialization std::vector<bool>
compresses values to individual bits. Therefore, there are multiple vector elements in a single byte, i.e., in a single memory location. Consequently, your threads updates same memory locations without synchronization, which is a data race (and therefore undefined behavior according to Standard).
Try to change std::vector<bool>
to std::vector<char>
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With