I'm trying to construct a set of unique words from a list of entries, each of which has a vector of strings.
So I made a function called Insert, which gets called for each of the entries like this:
for( auto & e : _Entries )
_Dictionary.Insert( begin( e.getNameWords( ) ), end( e.getNameWords( ) ) );
The class _Dictionary internally has a set (the STL container) and I wrote the function Insert as follows:
template< typename InputIterator >
void Insert( InputIterator first, InputIterator last )
{
for( auto it = first ; it != last ; ++it )
_AllWords.insert( *it );
}
In my case, calling Insert for all entries in _Entries took an average of 570 milliseconds.
Then I thought that I should use the functions that the STL already has to do the same thing that the for loop in Insert does, so I changed the function Insert to the following:
template< typename InputIterator >
void Insert( InputIterator first, InputIterator last )
{
copy( first, last, inserter( _AllWords, begin( _AllWords ) ) );
}
I was expecting this to
(guided by the philosophy of letting the STL do as much for you as you can). However, I was surprised to notice that this implementation actually took longer; not much more, but a consistent 200 milliseconds more than the previous for-loop based implementation.
I know this is an essentially trivial speed difference, but I'm still surprised.
So my question is: why is my implementation faster?
Note: I am compiling this with clang's version 3.5.2 with the libc++ standard library and with the -O3 flag, under Ubuntu 14.04.
The problem is this:
copy( first, last, inserter( _AllWords, begin( _AllWords ) ) );
ends up calling this version of insert
:
iterator insert( iterator hint, const value_type& value );
with begin()
as the hint. That is, typically, not where you're going to want to insert each value. As a result, you're just making the container do more work trying to figure out where to add your values since your hint
is as bad as possible.
But note that there is also this overload of insert
:
template< class InputIt >
void insert( InputIt first, InputIt last );
which you should just use†:
template< typename InputIterator >
void Insert( InputIterator first, InputIterator last )
{
_AllWords.insert(first, last);
}
And side-note, _AllWords
is a reserved identifier.
The overloads (5-6) are often implemented as a loop that calls the overload (3) with
end()
as the hint; they are optimized for appending a sorted sequence (such as another set) whose smallest element is greater than the last element in*this
That seems like a really specific goal to optimize against, which you may or may not satisfy, so probably you shouldn't use this overload, and your initial loop is just fine.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With