Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does Microsoft std::vector::insert use rotate()?

I was doing some experiments with lists vs vectors, and I noticed that the Microsoft implementation of std::vector is doing the following for .insert:

iterator insert(const_iterator _Where, _Ty&& _Val)
    {   // insert by moving _Val at _Where
    return (emplace(_Where, _STD move(_Val)));
    }

    iterator emplace(const_iterator _Where \
        COMMA LIST(_TYPE_REFREF_ARG)) \
    {   /* insert by moving _Val at _Where */ \
    size_type _Off = _VIPTR(_Where) - this->_Myfirst; \
    _VECTOR_EMPLACE_CHECK \
    emplace_back(LIST(_FORWARD_ARG)); \
    _STD rotate(begin() + _Off, end() - 1, end()); \
    return (begin() + _Off); \
    }

I could not figure out what rotate does in vs2012, but in 2015 does this:

template<class _RanIt> inline
    _RanIt _Rotate(_RanIt _First, _RanIt _Mid, _RanIt _Last,
        random_access_iterator_tag)
    {   // rotate [_First, _Last), random-access iterators
    _STD reverse(_First, _Mid);
    _STD reverse(_Mid, _Last);
    _STD reverse(_First, _Last);
    return (_First + (_Last - _Mid));
    }

// TEMPLATE FUNCTION reverse
template<class _BidIt> inline
    void _Reverse(_BidIt _First, _BidIt _Last, bidirectional_iterator_tag)
    {   // reverse elements in [_First, _Last), bidirectional iterators
    for (; _First != _Last && _First != --_Last; ++_First)
        _STD iter_swap(_First, _Last);
    }

This is not an optimal way to traverse memory if we think about our cache.

I did some benchmarks, where I held the element in a temporary and used that to swap elements and it was faster: This was it:

push_back(value); //My vector doesn't have resize/grow implemented
T tmp = *(end() - 1);
while(new_location != end())
{
    std::swap(tmp, *new_location);
    new_location++;
}

The full code and tests are here.

First question:

Why does it do that rotate instead of the second version of insert that I presented here?

The second version is a more cache-friendly alternative compared with the first one. For large vectors swapping with the last element in the vector introduces a time penalty due to the cache.

Is it in order to avoid storing another temporary?

Second question:

Why doesn't it just memmove the elements one position to the right?

Is there a standard requirement that forces you to swap elements instead of calling memmove? It's interesting that for POD there is not a special template specialization that just memmoves stuff around. In any case I am more interested in why rotate and not a more cache friendly alternative is used.

In my tests this is even faster than the previous two versions.

Tests are done like this:

0) for i = 0 to count

1) Pick a random position in a vector

2) Touch each element from 0 to that position (force a read on it)

3) Call insert after we reach the position

Compiled with Visual Studio 2012 x86, /O2.

For count = 100 000, element size = 4 bytes:

std::vector:                7.5 seconds   
std::list:                 19.6 seconds                            
MyVector:                   3.2 seconds                              
MyVector using memmove:     2.1 seconds

For count = 200 000, element size = 4 bytes:
std::vector:                30.3 seconds                          
std::list:                  45.5 seconds                          
MyVector:                   13.1 seconds
MyVector using memmove:      8.7 seconds   

For count = 20 000, element size = 128 bytes:
std::vector:                5.36 seconds
std::list:                  1.37 seconds
MyVector:                   5.12 seconds
MyVector (memmove)          1.68 seconds

I know this is not a real life thing that you would do, these were some experiments that I did in order to show that cache matters, and I accidentally discovered the way that std vector insert works.

Also I know MyVector is a bad implementation of a vector. I just quickly wrote it in order to test my assumptions for insert. I just want to discuss the insert() implementation, not Vector class design :).

Thanks for reading this

like image 691
Alex Avatar asked Feb 03 '16 11:02

Alex


1 Answers

It turns out that there is no particular reason to call rotate in std::vector::insert.

I'll paste here the rotate implementation for visual studio 2015 that is used in insert():

template<class _RanIt> inline
    _RanIt _Rotate(_RanIt _First, _RanIt _Mid, _RanIt _Last,
        random_access_iterator_tag)
    {   // rotate [_First, _Last), random-access iterators
    _STD reverse(_First, _Mid);
    _STD reverse(_Mid, _Last);
    _STD reverse(_First, _Last);
    return (_First + (_Last - _Mid));
    }

// TEMPLATE FUNCTION reverse
template<class _BidIt> inline
    void _Reverse(_BidIt _First, _BidIt _Last, bidirectional_iterator_tag)
    {   // reverse elements in [_First, _Last), bidirectional iterators
    for (; _First != _Last && _First != --_Last; ++_First)
        _STD iter_swap(_First, _Last);
    }

A more cache friendly implementation will increase the speed of this method (vector::insert).

I know because people from Microsoft's STL are aware of the issue :)

https://twitter.com/StephanTLavavej/status/695013465342083072

like image 85
Alex Avatar answered Oct 12 '22 10:10

Alex