
std::inserter with set - insert to begin() or end()? [duplicate]

I have some code that looks like this:

std::set<int> s1, s2, out;

// ... s1 and s2 are populated ...

std::set_intersection(s1.begin(), s1.end(),
                      s2.begin(), s2.end(),
                      std::inserter(out, out.end()));

I've read that inserts can be done in amortized constant time if the value being inserted into the set immediately follows the iterator given as a "hint". This would obviously be beneficial when running the set intersection, especially since everything being written to out is already in sorted order.

How do I guarantee this optimal performance? When creating the std::inserter, out is empty, so out.begin() == out.end() and I can't see that it makes any difference whether I specify out.begin() or out.end() as the hint. However, if this is interpreted as inserting every element at begin(), it doesn't seem that I would get the optimum algorithmic performance. Can this be done better?

AshleysBrain asked Aug 31 '10 14:08



3 Answers

I've chosen Alexander Gessler's answer as the 'correct' answer because it led me to this solution, which I thought I would post anyway. I've written a last_inserter() that guarantees the insert hint is always an iterator to the last element (or begin() if the set is empty). The reason is that set wants an iterator to the element preceding the actual insert position for best performance, so not end(), which would be one after the actual insert position. (That is the C++03 rule. C++11 and later instead specify amortized constant time when the element is inserted right before the hint, which makes end() the natural hint for ascending input there.)

The usage as per the original example is like this:

std::set<int> s1, s2, out;

// ... s1 and s2 are populated ...

std::set_intersection(s1.begin(), s1.end(),
                      s2.begin(), s2.end(),
                      last_inserter(out));  // note no iterator provided

This guarantees that the insert hint is always an iterator to the last element, hopefully providing best-case performance when using an output iterator to a set with a sorted range, as above.

Below is my implementation. I think it's platform specific to Visual C++ 2010's STL implementation, because it's based heavily on the existing insert_iterator, and I can only get it working by deriving from std::_Outit. If anyone knows how to make this portable, let me know:

// VC10 STL wants this to be a checked output iterator.  I haven't written one, but
// this needs to be defined to silence warnings about this.
#define _SCL_SECURE_NO_WARNINGS

template<class Container>
class last_inserter_iterator : public std::_Outit {
public:
    typedef last_inserter_iterator<Container> _Myt;
    typedef Container container_type;
    typedef typename Container::const_reference const_reference;
    typedef typename Container::value_type _Valty;

    last_inserter_iterator(Container& cont)
        : container(cont)
    {
    }

    _Myt& operator=(const _Valty& _Val)
    {
        container.insert(get_insert_hint(), _Val);
        return (*this);
    }

    _Myt& operator=(_Valty&& _Val)
    {
        container.insert(get_insert_hint(), std::move(_Val));  // _Val is a plain rvalue reference, so std::move is the idiomatic cast
        return (*this);
    }

    _Myt& operator*()
    {
        return (*this);
    }

    _Myt& operator++()
    {
        return (*this);
    }

    _Myt& operator++(int)
    {
        return (*this);
    }

protected:
    Container& container;

    typename Container::iterator get_insert_hint() const
    {
        // Container is empty: no last element to insert ahead of; just insert at begin.
        if (container.empty())
            return container.begin();
        else
        {
            // Otherwise return iterator to last element in the container.  std::set wants the
            // element *preceding* the insert position as a hint, so this should be an iterator
            // to the last actual element, not end().
            return (--container.end());
        }
    }
};

template<typename Container>
inline last_inserter_iterator<Container> last_inserter(Container& cont)
{
    return last_inserter_iterator<Container>(cont);
}
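
On the portability question: one possible approach is to drop the std::_Outit base and declare the five iterator typedefs that std::iterator_traits looks up directly in the class. The following is only a minimal sketch, assuming a C++11 compiler and standard library (it has not been tried on VC10); portable_last_insert_iterator, portable_last_inserter and intersect_with_last_inserter are names made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <utility>

// Hypothetical portable variant of last_inserter_iterator: always passes an
// iterator to the current last element (or begin() when empty) as the hint.
template <class Container>
class portable_last_insert_iterator {
public:
    // Typedefs read by std::iterator_traits; they stand in for std::_Outit.
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef std::ptrdiff_t difference_type;
    typedef void pointer;
    typedef void reference;
    typedef Container container_type;

    explicit portable_last_insert_iterator(Container& cont) : container(&cont) {}

    portable_last_insert_iterator& operator=(const typename Container::value_type& val) {
        container->insert(hint(), val);
        return *this;
    }
    portable_last_insert_iterator& operator=(typename Container::value_type&& val) {
        container->insert(hint(), std::move(val));
        return *this;
    }

    // Dereference and increment do nothing, as in std::insert_iterator.
    portable_last_insert_iterator& operator*() { return *this; }
    portable_last_insert_iterator& operator++() { return *this; }
    portable_last_insert_iterator operator++(int) { return *this; }

private:
    Container* container;  // a pointer keeps the iterator copy-assignable

    typename Container::iterator hint() const {
        if (container->empty()) return container->begin();
        return std::prev(container->end());  // iterator to the last element
    }
};

template <class Container>
portable_last_insert_iterator<Container> portable_last_inserter(Container& cont) {
    return portable_last_insert_iterator<Container>(cont);
}

// Usage as in the original example, wrapped in a function for convenience.
std::set<int> intersect_with_last_inserter(const std::set<int>& a, const std::set<int>& b) {
    std::set<int> out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          portable_last_inserter(out));
    assert(out.size() <= std::min(a.size(), b.size()));  // sanity check only
    return out;
}
```

The result is the same whichever hint is used; the hint only affects how quickly the set finds the insert position.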
AshleysBrain answered Sep 19 '22 07:09


You could use a custom functor instead of std::inserter and call out.end() again every time a new element is inserted, so the hint is always current.

Alternatively, if your values are sorted in descending order, out.begin() will be fine.
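
One way to read the functor suggestion is sketched below, assuming C++11. Standard algorithms write through an output iterator, so the functor is wrapped in a small adapter; fn_output_iterator and intersect_with_end_hint are hypothetical names for this example, not standard library components.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <utility>

// Hypothetical helper: an output iterator that passes every value written
// through it to a caller-owned callable.
template <class Fn>
class fn_output_iterator {
public:
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef std::ptrdiff_t difference_type;
    typedef void pointer;
    typedef void reference;

    explicit fn_output_iterator(Fn& fn) : fn_(&fn) {}

    // "*it = value" calls (*fn)(value) through this small proxy object.
    struct proxy {
        Fn* fn;
        template <class T>
        void operator=(T&& value) { (*fn)(std::forward<T>(value)); }
    };

    proxy operator*() const { return proxy{fn_}; }
    fn_output_iterator& operator++() { return *this; }
    fn_output_iterator operator++(int) { return *this; }

private:
    Fn* fn_;  // stored as a pointer so the iterator stays copy-assignable
};

std::set<int> intersect_with_end_hint(const std::set<int>& a, const std::set<int>& b) {
    std::set<int> out;
    // out.end() is evaluated on every call, so the hint never goes stale.
    auto insert_at_end = [&out](int v) { out.insert(out.end(), v); };
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          fn_output_iterator<decltype(insert_at_end)>(insert_at_end));
    assert(out.size() <= std::min(a.size(), b.size()));  // sanity check only
    return out;
}
```

The callable must outlive the iterator, because the adapter only stores a pointer to it.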

Alexander Gessler answered Sep 19 '22 07:09


According to the libstdc++ source at http://gcc.gnu.org/onlinedocs/gcc-4.8.0/libstdc++/api/a01553_source.html, the assignment operator of insert_iterator is:

insert_iterator&
operator=(typename _Container::value_type&& __value)
{
  iter = container->insert(iter, std::move(__value));
  ++iter;
  return *this;
}

Here iter is initially the iterator you passed to std::inserter. After each assignment, iter points one past the value you just inserted, so if you're inserting in order and started from out.end(), the hint stays at end() and the insert should be optimally efficient.
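
The bookkeeping in that operator= can be checked by hand. The sketch below repeats its two steps (hinted insert, then increment) on a plain std::set iterator and reports whether the hint was end() throughout; hint_stays_at_end is a made-up name for this example, not a library function.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Mimics what insert_iterator does on each assignment, starting from end(),
// and returns true if the stored iterator was end() before every insertion
// and is still end() afterwards.
bool hint_stays_at_end(const std::vector<int>& values) {
    std::set<int> out;
    std::set<int>::iterator iter = out.end();  // what std::inserter(out, out.end()) stores
    bool always_end = true;
    for (int v : values) {
        always_end = always_end && (iter == out.end());
        iter = out.insert(iter, v);  // hinted insert returns an iterator to the element
        assert(*iter == v);
        ++iter;                      // step past it, as in the quoted operator=
    }
    return always_end && iter == out.end();
}
```

For ascending input the function returns true; for input such as {3, 1} the iterator ends up in the middle of the set and it returns false.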

robson answered Sep 21 '22 07:09