Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance gap between vector<bool> and array

I was trying to solve a coding problem in C++ which counts the number of prime numbers less than a non-negative number n.

So I first came up with some code:

int countPrimes(int n) {
    vector<bool> flag(n+1,1);
    for(int i =2;i<n;i++)
    {
        if(flag[i]==1)
            for(long j=i;i*j<n;j++)
                flag[i*j]=0;
    }
    int result=0;
    for(int i =2;i<n;i++)
        result+=flag[i];
    return result;
}

which takes 88 ms and uses 8.6 MB of memory. Then I changed my code into:

int countPrimes(int n) {
    // vector<bool> flag(n+1,1);
    bool flag[n+1] ;
    fill(flag,flag+n+1,true);
    for(int i =2;i<n;i++)
    {
        if(flag[i]==1)
            for(long j=i;i*j<n;j++)
                flag[i*j]=0;
    }
    int result=0;
    for(int i =2;i<n;i++)
        result+=flag[i];
    return result;
}

which takes 28 ms and 9.9 MB. I don't really understand why there is such a performance gap in both the running time and memory consumption. I have read relative questions like this one and that one but I am still confused.

EDIT: I reduced the running time to 40 ms with 11.5 MB of memory after replacing vector<bool> with vector<char>.

like image 797
Qin Heyang Avatar asked Apr 18 '19 11:04

Qin Heyang


People also ask

Which is faster vector or array?

A std::vector can never be faster than an array, as it has (a pointer to the first element of) an array as one of its data members. But the difference in run-time speed is slim and absent in any non-trivial program. One reason for this myth to persist, are examples that compare raw arrays with mis-used std::vectors.

How much slower are vectors than arrays?

The time difference is 0.333 seconds. Or a difference of 0.000000000333 per array access. Assuming a 2.33 GHz Processor like mine that's 0.7 instruction pipeline stages per array accesses. So the vector looks like it is using one extra instruction per accesses.

Is Vector better than array C++?

Vector is better for frequent insertion and deletion, whereas Arrays are much better suited for frequent access of elements scenario. Vector occupies much more memory in exchange for managing storage and growing dynamically, whereas Arrays are a memory-efficient data structure.


3 Answers

std::vector<bool> isn't like any other vector. The documentation says:

std::vector<bool> is a possibly space-efficient specialization of std::vector for the type bool.

That's why it may use up less memory than an array, because it might represent multiple boolean values with one byte, like a bitset. It also explains the performance difference, since accessing it isn't as simple anymore. According to the documentation, it doesn't even have to store it as a contiguous array.

like image 131
Blaze Avatar answered Oct 24 '22 00:10

Blaze


std::vector<bool> is special case. It is specialized template. Each value is stored in single bit, so bit operations are needed. This memory compact but has couple drawbacks (like no way to have a pointer to bool inside this container).

Now bool flag[n+1]; compiler will usually allocate same memory in same manner as for char flag[n+1]; and it will do that on stack, not on heap.

Now depending on page sizes, cache misses and i values one can be faster then other. It is hard to predict (for small n array will be faster, but for larger n result may change).

As an interesting experiment you can change std::vector<bool> to std::vector<char>. In this case you will have similar memory mapping as in case of array, but it will be located at heap not a stack.

like image 16
Marek R Avatar answered Oct 24 '22 00:10

Marek R


I'd like to add some remarks to the good answers already posted.

  • The performance differences between std::vector<bool> and std::vector<char> may vary (a lot) between different library implementations and different sizes of the vectors.

    See e.g. those quick benches: clang++ / libc++(LLVM) vs. g++ / libstdc++(GNU).

  • This: bool flag[n+1]; declares a Variable Length Array, which (despites some performance advantages due to it beeing allocated in the stack) has never been part of the C++ standard, even if provided as an extension by some (C99 compliant) compilers.

  • Another way to increase the performances could be to reduce the amount of calculations (and memory occupation) by considering only the odd numbers, given that all the primes except for 2 are odd.

If you can bare the less readable code, you could try to profile the following snippet.

int countPrimes(int n)
{
    if ( n < 2 )
        return 0;
    // Sieve starting from 3 up to n, the number of odd number between 3 and n are
    int sieve_size = n / 2 - 1;
    std::vector<char> sieve(sieve_size); 
    int result = 1;  // 2 is a prime.

    for (int i = 0; i < sieve_size; ++i)
    {
        if ( sieve[i] == 0 )
        {
            // It's a prime, no need to scan the vector again
            ++result;
            // Some ugly transformations are needed, here
            int prime = i * 2 + 3;
            for ( int j = prime * 3, k = prime * 2; j <= n; j += k)
                sieve[j / 2 - 1] = 1;
        }
    }

    return result;
}

Edit

As Peter Cordes noted in the comments, using an unsigned type for the variable j

the compiler can implement j/2 as cheaply as possible. C signed division by a power of 2 has different rounding semantics (for negative dividends) than a right shift, and compilers don't always propagate value-range proofs sufficiently to prove that j will always be non-negative.

It's also possible to reduce the number of candidates exploiting the fact that all primes (past 2 and 3) are one below or above a multiple of 6.

like image 6
Bob__ Avatar answered Oct 24 '22 01:10

Bob__