Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can I safelly share a variable across threads in C++ using only std::atomic without std::mutex?

I made a program which calculates prime numbers on multiple cores. (Please ignore the fact, that the algorithm is not perfectly effective and numbers 0 and 1 are here considered to be primes. The purpose is only to practice using threads.)

The variable taken (a number to be tested next) is beeing shared across 8 threads.

The problem is that it can be incremented by one thread and right after that by another thread and read by them when its already incremented twice (or more times), so some values can be skipped, which is a bad thing.

I assumed it can be solved by using std::atomic_uint as the variable type, but I was apparently wrong.

Is there any way I could solve this problem without the need of using std::mutex since I heard it causes considerable overhead? source code:

#include <iostream>
#include <chrono>
#include <vector>
#include <algorithm>
#include <thread>
#include <atomic>

int main()
{
    const uint MAX = 1000;

    std::vector<bool> isPrime(MAX), done(MAX);
    std::fill(done.begin(), done.end(), false);
    std::atomic_uint taken{0}; //shared variable
    std::vector<std::thread> threads;
    auto start = std::chrono::system_clock::now();

    for (uint i = 0; i < 8; ++i) {
        threads.emplace_back(
            [&](){
                bool res;
                for (uint tested; (tested = taken.fetch_add(1)) < MAX; ) { //taken should be incremented and copied atomically
                    res = true;
                    for (uint k = 2; k < tested; ++k) {
                        if (tested % k == 0) {
                            res = false;
                            break;
                        }
                    }
                    isPrime[tested] = res;
                    done[tested] = true;
                }
            }
        );
    }
    for (auto & t : threads) {
        t.join();
    }

    auto end = std::chrono::system_clock::now();
    auto milliseconds = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
    uint num = std::count_if(isPrime.begin(), isPrime.end(), [](bool b){return b;});
    uint nDone = std::count_if(done.begin(), done.end(), [](bool b){return !b;});
    std::cout << "number: " << num << " duration: " << milliseconds.count() << '\n';
    std::cout << "not done: " << nDone << '\n';
    for (uint i = 0; i < MAX; ++i) { //Some numbers are always skipped
        if (!done[i]) {
            std::cout << i << ", ";
        }
    }
    std::cout << '\n';
    return 0;
}

The code was compiled using g++ with -O3 and -pthread arguments. output:

number: 169 duration: 1
not done: 23
143, 156, 204, 206, 207, 327, 328, 332, 334, 392, 393, 396, 502, 637, 639, 671, 714, 716, 849, 934, 935, 968, 969,

The output is different every time.

like image 702
Petr Král Avatar asked Jan 16 '18 15:01

Petr Král


1 Answers

The specialization std::vector<bool> compresses values to individual bits. Therefore, there are multiple vector elements in a single byte, i.e., in a single memory location. Consequently, your threads updates same memory locations without synchronization, which is a data race (and therefore undefined behavior according to Standard).

Try to change std::vector<bool> to std::vector<char>.

like image 187
Daniel Langr Avatar answered Oct 16 '22 20:10

Daniel Langr