Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there any position limitation on the insert hint for std::map?

Tags:

c++

stl

I just found std::map.insert could accept an iterator as its first parameter as a search hint for the insertion process, such as map.insert(hintIterator, insertElement);. But is there any position requirement for the hint element? Does it need to be just before or after the insert position? By the way, how much effect does the hint iterator position have on the insertion efficiency?

like image 240
Thomson Avatar asked Dec 03 '10 03:12

Thomson


Video Answer


2 Answers

It can be anywhere from begin() to end(). If you pass in an iterator corresponding to the ideal insertion position you can massively improve the insertion cost does to searching for the correct location.

From the SGI STL site

Complexity guarantees

Insert with hint is logarithmic in general, but it is amortized constant time if t is inserted immediately before p.

To understand why this is, think about the algorithm that's being applied. std::map has items arranged in some order, and in order to find the correct insertion point the items must be walked through until it finds a position where one item ("A") must precede the new data and the next item ("B") must follow it. Given this location in advance the search can be eliminated.

The new data has to go between these two items, and update the links between them. At the very least (for forward-iterable containers) item A must be updated to point to the new data, which subsequently points to B. If the contained is reverse-iterable, B must also be updated to point to the new data.

How should the location be specified? One must know wither A or B. As Cubbi points out, and discussed on alt.comp.lang.learn.c-cpp the 2003 standard differs on what the hint should be. The SGI documentation suggests B is the one needed, while the standard suggests A. Possibly (given that std::map has bidriectional iterators) it doesn't matter. I would suggest, though, that the position of the lower item (A) would be best, since you can always expect to be able to continue searching forwards.

UPDATE: Since educated guesses are useless until validated, here is a quick test:

#include <ctime>
#include <map>
#include <iostream>

int main() {
    typedef std::map<unsigned int,unsigned int> map;

    const unsigned int k=100000;
    const unsigned int reps=10;

    // Avoid edge cases by padding either end
    map srcMap;
    {
        for(unsigned int i=0; i!=k;++i) {
            srcMap.insert(std::make_pair(i,i));
        }
        unsigned int l=3*k;
        for(unsigned int i=2*k; i!=l;++i) {
            srcMap.insert(std::make_pair(i,i));
        }
    }

    std::cout << "Hint is always the position of the preceding value\n";
    for(unsigned int i=0; i!=reps;++i)
    {
        map testMap(srcMap);
        map::iterator p=testMap.lower_bound(k-1);
        unsigned int l=2*k;
        std::clock_t start = std::clock();
        for(unsigned int i=k; i!=l;++i) {
            p=testMap.insert(p,std::make_pair(i,i));
        }
        std::clock_t end = std::clock();
        std::cout << static_cast<double>((end - start) ) << " ";
    }
    std::cout << std::endl;

    std::cout << "Hint is always the position of the following value\n";
    for(unsigned int i=0; i!=reps;++i)
    {
        map testMap(srcMap);
        map::iterator p=testMap.lower_bound(2*k);
        unsigned int l=k-1;
        std::clock_t start = std::clock();
        for(unsigned int i=2*k-1; i!=l;--i) {
            p=testMap.insert(p,std::make_pair(i,i));
        }
        std::clock_t end = std::clock();
        std::cout << static_cast<double>((end - start) ) << " ";
    }
    std::cout << std::endl;

    std::cout << "Hint is always the start of the container\n";
    for(unsigned int i=0; i!=reps;++i)
    {
        map testMap(srcMap);
        unsigned int l=2*k;
        std::clock_t start = std::clock();
        for(unsigned int i=k; i!=l;++i) {
            testMap.insert(testMap.begin(),std::make_pair(i,i));
        }
        std::clock_t end = std::clock();
        std::cout << static_cast<double>((end - start)) << " ";
    }
    std::cout << std::endl;

    std::cout << "No hint\n";
    for(unsigned int i=0; i!=reps;++i)
    {
        map testMap(srcMap);
        unsigned int l=2*k;
        std::clock_t start = std::clock();
        for(unsigned int i=k; i!=l;++i) {
            testMap.insert(std::make_pair(i,i));
        }
        std::clock_t end = std::clock();
        std::cout << static_cast<double>((end - start)) << " ";
    }
    std::cout << std::endl;

    return 0;
}

On my little netbook running MinGW GCC 4.5 I got:

Hint is always the position of the preceding value
94 109 109 109 109 109 110 110 110 94
Hint is always the position of the following value
109 94 109 94 110 110 109 109 110 110
Hint is always the start of the container
188 172 172 187 172 172 235 187 172 187
No hint
172 171 172 172 172 172 172 172 171 172

So I'd say hinting on either side gives about the same result, and is always better than no hint at all. Choosing a poor hint position (such as the start) is worse than no hint at all.

like image 193
beldaz Avatar answered Oct 04 '22 08:10

beldaz


From Unique Sorted Associative Container concept:

Insert with hint is logarithmic in general, but it is amortized constant time if t is inserted immediately before p.

And only this about the hint iterator:

The argument p is a hint: it points to the location where the search will begin.
like image 43
Nikolai Fetissov Avatar answered Oct 04 '22 08:10

Nikolai Fetissov