I am working on an external sorting algorithm that uses std::queue and must carefully constrain its memory usage. I have noticed that during the merge phase (which uses several std::queues of fixed length), my memory usage increases to about 2.5X what I expected. Since std::queue by default uses std::deque as its underlying container, I ran some tests on std::deque to determine its memory overhead. Here are the results, running on VC++ 9, in release mode, with a 64-bit process:
When adding 100,000,000 chars to a std::deque, the memory usage grows to 252,216K. Note that 100M chars (1 byte each) should occupy 97,656K, so this is an overhead of 154,560K.
I repeated the test with doubles (8 bytes) and saw memory grow to 1,976,676K, while 100M doubles should occupy 781,250K, for an overhead of 1,195,426K!
Now I understand that std::deque is normally implemented as a linked list of "chunks." If this is true, then why is the overhead proportional to the element size (the per-chunk pointer should be a fixed 8 bytes)? And why is it so danged huge?
Can anybody shed some light on why std::deque uses so much danged memory? I'm thinking I should switch my std::queue's underlying container to std::vector, since that has no overhead (given that the appropriate size is reserved). I'm thinking the benefits of std::deque are largely negated by such a huge overhead (resulting in cache misses, page faults, etc.), and that the cost of copying std::vector elements may be less, given that the overall memory usage is so much lower. Is this just a bad implementation of std::deque by Microsoft?
For search, list is clearly slow, while deque and vector have about the same performance; deque appears to be faster than vector for very large data sizes.

Vector provides good performance for insertion and deletion at the end only, and poor performance for insertion and deletion in the middle. Deque provides the same kind of performance as vector for insertion and deletion at the end and in the middle; in addition, deque provides good performance for insertion and deletion at the front.

Overall, for insertions, vector and deque are the fastest for small types, and list is the fastest for very large types.
As the quote from cppreference states, std::deque stores its elements in multiple fixed-size arrays; within each array the elements are contiguous.
Look at the code for _DEQUESIZ (number of elements per block):
#define _DEQUESIZ (sizeof (_Ty) <= 1 ? 16 \
: sizeof (_Ty) <= 2 ? 8 \
: sizeof (_Ty) <= 4 ? 4 \
: sizeof (_Ty) <= 8 ? 2 : 1) /* elements per block (a power of 2) */
The block gets smaller as the element gets larger. Only for elements larger than 8 bytes do you get the expected behavior (overhead that shrinks, as a percentage, as the element size grows).