Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can std::bitset be faster than std::vector<bool>?

Tags:

c++

According to this answer the poster expects a std::bitset of size 100k bits to be faster than a std::vector<bool> when querying individual bits. How can this be possible?

How could they even differ significantly in their implementation, if std::bitset apparently allows for arbitrary sizes just like std::vector?

like image 651
Martin Ba Avatar asked Nov 11 '10 16:11

Martin Ba


People also ask

Is Bitset faster than vector bool?

So it seems under these conditions, bitset is faster than vector when the code is optimized, while vector actually comes out on top by a (very) small margin when it's not.

Is std :: Bitset efficient?

It's not like bitset is implemented very inefficiently for what it does. If you genuinely need to access a bunch of bits with a random access pattern which, for some reason or other, needs to check and set just one bit a time, then it might be ideally implemented for such a purpose.

Is std :: vector fast?

A std::vector can never be faster than an array, as it has (a pointer to the first element of) an array as one of its data members. But the difference in run-time speed is slim and absent in any non-trivial program. One reason for this myth to persist, are examples that compare raw arrays with mis-used std::vectors.

What is std :: Bitset?

template< std::size_t N > class bitset; The class template bitset represents a fixed-size sequence of N bits. Bitsets can be manipulated by standard logic operators and converted to and from strings and integers.


2 Answers

Measurements on Visual Studio 2010 show that std::bitset is not generally faster than std::vector<bool>. What the exact reason for this is I cannot say -- only that bitset is implemented significantly different from the std::vector full specialization.

std::bitset stores it's full content in the object via a

template<size_t _Bits>     class bitset .....      _Ty _Array[_Words + 1]; // the set of bits     }; 

array and that makes large bitset unsuitable to be put on the stack -- which isn't a performance argument per se.

vector<bool> doesn't suffer from the stack problem, and testing with a size of 1e6 and 1e7 it seems that on my box here querying values in a loop is actually 2x faster with a vector.

Well. I guess the usual timing caveats apply and YMMV, but here's the test code I used should anyone care to try himself:

The output on my box is:

1 vector<bool> loop with a size of 10000000 and 10 iterations*n: 11187 ms bitset<10000000> loop with 10 iterations*n: 22719 ms 101250010 Press any key to continue . . . 

BitMap.cpp

#include "stdafx.h" #include "BitMap.h"  using namespace std;  // Global var to prevent optimizer from messing things up volatile size_t ext;  volatile clock_t t1; volatile clock_t t2; double delta1; double delta2;  int main(int argc, _TCHAR* argv[]) {   ext = 1;   printf("%d\n", ext);    vb_t *const vec = new vb_t(bssz);   bs_t *const bits = new bs_t(); // must put large bitset on heap    const int iter = 10;   delta1=0;   delta2=0;   for(int o=0; o<5; ++o) {     t1 = clock();     for(int i=0; i!=5; ++i)       bs_loop(iter, *vec);     t2 = clock();     delta1 += t2-t1;     t1 = clock();     for(int i=0; i!=5; ++i)       bs_loop(iter, *bits);     t2 = clock();     delta2 += t2-t1;   }    delta1 /= CLOCKS_PER_SEC;   delta2 /= CLOCKS_PER_SEC;   delta1 *= 1000;   delta2 *= 1000;    cout << "vector<bool> loop with a size of " << bssz << " and " << iter << " iterations*n: " << delta1 << " ms\n";   cout << "bitset<" << bssz << "> loop with " << iter << " iterations*n: " << delta2 << " ms\n";    printf("%d\n", ext);   delete vec;   delete bits;   return 0; } 

BitMap.h

#pragma once #include <vector> #include <bitset>  extern volatile size_t ext; const size_t bssz = size_t(1e7); // 1e7 ca 10m  using namespace std; // Test code, using here is OK. typedef vector<bool> vb_t; typedef bitset<bssz> bs_t;  template<class COLL> void bs_loop(const int iterations, COLL const& v); 

bs_loop.cpp

#include "stdafx.h" #include "BitMap.h"  template<class COLL> void bs_loop(const int iterations, COLL const& v) {   ext = sizeof(COLL);   for(size_t i=0; i!=iterations; ++i) {     ++ext;     for(size_t j=0, e=v.size(); j!=e; ++j) {       if(v[j]) {         --ext;       }       else {         ++ext;       }     }   } }  template void bs_loop(const int iterations, vb_t const& v);  template void bs_loop(const int iterations, bs_t const& v); 

Compiler command line:

/Zi /nologo /W3 /WX- /O2 /Oi /Oy- /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /Gm- /EHsc /GS /Gy  /fp:precise /Zc:wchar_t /Zc:forScope /Yu"StdAfx.h" /Fp"Release\BitMap.pch"  /Fa"Release\" /Fo"Release\" /Fd"Release\vc100.pdb" /Gd /analyze-  /errorReport:queue  

note the /O2 and the missing /GL (no whole prg opt).

like image 174
Martin Ba Avatar answered Sep 28 '22 20:09

Martin Ba


Well, since I'm the guy you're basing this question on, here's where I got that idea from:

"…it packs the bools and stores them as individual bits (inside, say, chars) in its internal representation. One consequence of this is that it can't just return a normal bool& from its operator[] or its dereferenced iterators[2]; instead, it has to play games with a helper "proxy" class that is bool-like but is definitely not a bool. Unfortunately, that also means that access into a vector<bool> is slower, because we have to deal with proxies instead of direct pointers and references.

Bottom line: If you care more about speed than you do about size, you shouldn't use std::vector<bool>. Instead, you should hack around this optimization by using a std::vector<char> or the like instead, which is unfortunate but still the best you can do."

Or, as I recommended, if you know the biggest size that your set will get, use std::bitset.

like image 32
Steve M Avatar answered Sep 28 '22 19:09

Steve M