Why does libc++'s std::vector internally keep three pointers instead of one pointer and two sizes?

I'm looking at the implementation of std::vector in libc++ and I noticed that it internally keeps three pointers (one to the beginning, one to the end, and one to the end of the allocated memory) instead of what I'd instinctively do, i.e., one pointer to the beginning plus size and capacity members.

Here is the code from libc++'s <vector> (ignore the compressed pair, I know what it means).

    pointer                                    __begin_;
    pointer                                    __end_;
    __compressed_pair<pointer, allocator_type> __end_cap_;

I noticed that other standard libraries do the same (e.g. Visual C++). I don't see any particular reason why this solution should be faster than the other one, but I might be wrong.

So is there a particular reason the "three pointers" solution is preferred to the "pointer + sizes" one?

499

asked May 24 '15 09:05

gigabytes

2 Answers

The rationale is that performance should be optimized for iterators, not indices.
(In other words, performance should be optimized for begin()/end(), not size()/operator[].)
Why? Because iterators are generalized pointers, so C++ encourages their use and in return ensures that their performance matches that of raw pointers when the two are equivalent.

To see why it's a performance issue, notice that the typical for loop is as follows:

    for (It i = items.begin(); i != items.end(); ++i)
        ...

Except in the most trivial cases, if we kept track of sizes instead of pointers, the comparison i != items.end() would turn into i != items.begin() + items.size(), which takes more instructions than you'd expect. (The optimizer often has a hard time hoisting that computation out of the loop.) This can slow a tight loop down dramatically, and hence this design is avoided.

(I've verified this is a performance problem when trying to write my own replacement for std::vector.)
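To make the comparison concrete, here is a minimal sketch of the two layouts side by side. The names ThreePointers, PointerAndSizes and sum are hypothetical and not taken from libc++; the element type is fixed to int to keep it short. Whether a given compiler hoists the begin_ + size_ computation out of the loop depends on the surrounding code, so treat this as an illustration rather than a benchmark.

```cpp
#include <cstddef>

// Hypothetical layout A: three pointers, like the libc++ members in the question.
struct ThreePointers {
    int* begin_;
    int* end_;
    int* end_cap_;

    int* begin() const { return begin_; }
    int* end() const { return end_; }  // returns a stored member directly
    std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }
};

// Hypothetical layout B: one pointer plus size and capacity.
struct PointerAndSizes {
    int* begin_;
    std::size_t size_;
    std::size_t capacity_;

    int* begin() const { return begin_; }
    int* end() const { return begin_ + size_; }  // reads two members, then adds
    std::size_t size() const { return size_; }
};

// The "typical for loop" from the answer, written once for both layouts.
// r.end() is evaluated in the loop condition, as in the original snippet.
template <class Range>
long sum(const Range& r) {
    long total = 0;
    for (int* i = r.begin(); i != r.end(); ++i) {
        total += *i;
    }
    return total;
}
```

Both layouts produce the same result; the difference is only in how much work end() does each time the loop condition is evaluated, and in which of size() or end() needs arithmetic.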

Edit: As Yakk pointed out in the comments, using indices instead of pointers can also result in the generation of a multiplication instruction when the element size isn't a power of 2, which is pretty expensive and noticeable in a tight loop. I didn't think of this when writing this answer, but it's a phenomenon that's bitten me before (e.g. see here). The bottom line is that in a tight loop, everything matters.
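A rough sketch of the indexing point, assuming a hypothetical element type Point3 whose size is typically 12 bytes, so not a power of two. The function names are made up for illustration. Conceptually, p[i] means "base address plus i times sizeof(Point3)", while the pointer loop only ever adds a constant. Optimizers frequently turn the first form into the second when they can, so the cost shows up mainly in loops they fail to rewrite.

```cpp
#include <cstddef>

// Hypothetical element type; three ints are typically 12 bytes.
struct Point3 {
    int x, y, z;
};

// Index-based traversal: the address of p[i] is base + i * sizeof(Point3),
// which needs a multiply (or shift/add sequence) unless the compiler
// strength-reduces it.
long sum_x_by_index(const Point3* p, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i != n; ++i) {
        total += p[i].x;
    }
    return total;
}

// Pointer-based traversal: the cursor advances by a constant
// sizeof(Point3) on every step, and the end test is a plain compare.
long sum_x_by_pointer(const Point3* first, const Point3* last) {
    long total = 0;
    for (const Point3* it = first; it != last; ++it) {
        total += it->x;
    }
    return total;
}
```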

150

answered Oct 21 '22 17:10

user541686

It's more convenient for implementers.

Storing the size makes exactly one operation easier to implement, namely size():

size_t size() { return size_; }

On the other hand, it makes other operations harder to write and makes reusing code harder:

    iterator end() { return iterator(end_); }           // range version
    iterator end() { return iterator(begin_ + size_); } // pointer + size version

    void push_back(const T& v) // range version
    {
        // assume only the case where there is enough capacity
        ::new(static_cast<void*>(end_)) T(v);
        ++end_;
    }

    void push_back(const T& v) // pointer + size version
    {
        // assume only the case where there is enough capacity
        ::new(static_cast<void*>(begin_ + size_)) T(v);
        // it could use some internal `get_end` function, but the point still stands:
        // we need to get to the end
        ++size_;
    }

If we have to find the end anyway, we might as well store it directly; it's more useful than the size.
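As a small sketch of the range version put together, assuming a hypothetical non-owning RangeStorage type (not libc++'s actual class): size() and capacity() become pointer subtractions, while end() and push_back use the stored end pointer directly. Growth and destruction are left out, matching the "enough capacity" assumption above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical three-pointer storage over memory owned by someone else.
template <class T>
struct RangeStorage {
    T* begin_;    // first element
    T* end_;      // one past the last constructed element
    T* end_cap_;  // one past the end of the available memory

    // Sizes are derived from the pointers on demand.
    std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }
    std::size_t capacity() const { return static_cast<std::size_t>(end_cap_ - begin_); }

    T* begin() const { return begin_; }
    T* end() const { return end_; }

    // Range version of push_back; reallocation is deliberately omitted,
    // so the caller must guarantee there is room.
    void push_back(const T& v) {
        assert(end_ != end_cap_ && "sketch assumes enough capacity");
        ::new (static_cast<void*>(end_)) T(v);
        ++end_;
    }
};
```

For example, wrapping a two-element int buffer and calling push_back twice leaves size() equal to capacity(), with end() pointing one past the second element.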

answered Oct 21 '22 17:10

milleniumbug
