Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ threaded application runs slower than non-threaded

I am currently writing a prime number generator in C++. I have made a single-threaded version first, and later a multi-threaded version.

I found out that if my program generates values less than 100'000, the single-threaded version is faster than the multi-threaded one. Obviously I am doing something wrong.

My code below:

#include <iostream>
#include <fstream>
#include <set>
#include <string>
#include <thread>
#include <mutex>
#include <shared_mutex>

using namespace std;

set<unsigned long long> primeContainer;
shared_mutex m;

void checkPrime(const unsigned long long p)
{
    if (p % 3 == 0)
        return;

    bool isPrime = true;
    for (set<unsigned long long>::const_iterator it = primeContainer.cbegin(); it != primeContainer.cend(); ++it)
    {
        if (p % *it == 0)
        {
            isPrime = false;
            break;
        }
        if (*it * *it > p) // check only up to square root
            break;
    }

    if (isPrime)
        primeContainer.insert(p);
}

void checkPrimeLock(const unsigned long long p)
{
    if (p % 3 == 0)
        return;

    bool isPrime = true;
    try
    {
        shared_lock<shared_mutex> l(m);
        for (set<unsigned long long>::const_iterator it = primeContainer.cbegin(); it != primeContainer.cend(); ++it)
        {
            if (p % *it == 0)
            {
                isPrime = false;
                break;
            }
            if (*it * *it > p)
                break;
        }
    }
    catch (exception& e)
    {
        cout << e.what() << endl;
        system("pause");
    }

    if (isPrime)
    {
        try
        {
            unique_lock<shared_mutex> l(m);
            primeContainer.insert(p);
        }
        catch (exception& e)
        {
            cout << e.what() << endl;
            system("pause");
        }
    }
}

void runLoopThread(const unsigned long long& l)
{
    for (unsigned long long i = 10; i < l; i += 10)
    {
        thread t1(checkPrimeLock, i + 1);
        thread t2(checkPrimeLock, i + 3);
        thread t3(checkPrimeLock, i + 7);
        thread t4(checkPrimeLock, i + 9);
        t1.join();
        t2.join();
        t3.join();
        t4.join();
    }
}

void runLoop(const unsigned long long& l)
{
    for (unsigned long long i = 10; i < l; i += 10)
    {
        checkPrime(i + 1);
        checkPrime(i + 3);
        checkPrime(i + 7);
        checkPrime(i + 9);
    }
}

void printPrimes(const unsigned long long& l)
{
    if (1U <= l)
        cout << "1 ";
    if (2U <= l)
        cout << "2 ";
    if (3U <= l)
        cout << "3 ";
    if (5U <= l)
        cout << "5 ";

    for (auto it = primeContainer.cbegin(); it != primeContainer.cend(); ++it)
    {
        if (*it <= l)
            cout << *it << " ";
    }
    cout << endl;
}

void writeToFile(const unsigned long long& l)
{
    string name = "primes_" + to_string(l) + ".txt";
    ofstream f(name);

    if (f.is_open())
    {
        if (1U <= l)
            f << "1 ";
        if (2U <= l)
            f << "2 ";
        if (3U <= l)
            f << "3 ";
        if (5U <= l)
            f << "5 ";

        for (auto it = primeContainer.cbegin(); it != primeContainer.cend(); ++it)
        {
            if (*it <= l)
                f << *it << " ";
        }
    }
    else
    {
        cout << "Error opening file." << endl;
        system("pause");
    }
}

int main()
{
    unsigned int n = thread::hardware_concurrency();
    std::cout << n << " concurrent threads are supported." << endl;

    unsigned long long limit;
    cout << "Please enter the limit of prime generation: ";
    cin >> limit;

    primeContainer.insert(7);

    if (10 < limit)
    {
        //runLoop(limit); //single-threaded
        runLoopThread(limit); //multi-threaded
    }

    printPrimes(limit);
    //writeToFile(limit);
    system("pause");
    return 0;
}

In the main function you will find comments about which function is single and multi threaded.

The main difference between those is the use of locks, shared for container iteration, and unique for insertion. If it matters, my CPU has 4 cores.

Why is the single thread version faster?

like image 633
IDDQD Avatar asked May 15 '16 20:05

IDDQD


People also ask

Why is multithreading slower than single thread?

Every thread needs some overhead and system resources, so it also slows down performance. Another problem is the so called "thread explosion" when MORE thread are created than cores are on the system. And some waiting threads for the end of other threads is the worst idea for multi threading.

Are multithreaded programs faster?

Therefore, multithreaded programs can run much faster than on a uniprocessor system. They can also be faster than a program using multiple processes, because threads require fewer resources and generate less overhead.

Can multithreading decrease performance?

Disadvantages. Depending on the design and architecture of the processor, simultaneous multithreading can decrease performance if any of the shared resources are bottlenecks for performance.

Does threading improve performance?

If you're unsatisfied with the processing output of your computer, it might be time to hyper-thread your central processing unit's (CPU) cores. Hyper-threading can be a great way to improve the processing speed of your PC without having to go through a major hardware makeover.


1 Answers

You have several problems.

First, you keep creating and destroying threads needlessly. Have each thread loop doing work until there is no more work to do.

Second, your locks are way too fine and as a result, you're acquiring them way too often. Have each thread grab a block of 100 numbers to test rather than one at a time, and have them insert found primes from each block in one go.

like image 174
David Schwartz Avatar answered Nov 02 '22 09:11

David Schwartz