Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does the Standard deal with self-referencing iterators in container insert functions?

Tags:

c++

When dealing with C++ Standard Library containers like std::vector, how do their range-based insertion methods handle the user using iterators that reference the vector's own contents?

Presumably if they are say, vector::iterator already, then the implementation can special-case this scenario, but if they are a user-defined type which eventually results in accessing the vector, how does the vector deal with keeping those iterators valid whilst evaluating the range? Does the Standard simply ban referencing the vector in the range?

For a simple example, consider an iterator whose value_type is size_t, and the result of de-referencing it is the size of the vector being inserted into.

struct silly_iterator {
    vector<std::size_t>* v;
    unsigned number;
    std::size_t operator*() { return v->size(); }
    operator++() { --number; }
    bool operator==(silly_iterator other) const { return number == 0; }
    // other methods
};
std::vector<std::size_t> vec = { 3, 4, 5, 6, 7 };
vector.insert(vector.begin() + 2, silly_iterator(&vec, 10), silly_iterator());

What is the contents of vec after this code has executed?

For another example,

struct silly_iterator { 
    std::vector<std::size_t>* v; 
    std::size_t operator*() { return 0; } 
    operator++() { --number; v->push_back((*v)[4]); } 
    bool operator==(silly_iterator other) const { return number == 0; } 
    // other methods 
}; 
std::vector<std::size_t> vec = { 3, 4, 5, 6, 7 }; 
vec.insert(vec.begin() + 2, silly_iterator(&vec, 10), silly_iterator());
like image 204
Puppy Avatar asked May 12 '13 14:05

Puppy


2 Answers

In C++11 n3242 (which is slightly old draft) in 23.2.3 table 100 we learn that for the iterator pair insert function, pre: i and j are not iterators into a. I believe based on that wording I would choose to broadly interpret that as i and j shall not access a, and that both of your iterators are undefined behavior.

But let's say that my broad interpretation is not what was intended by the standard. Then to answer your question, for input iterators the result is almost certainly going to be: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 5, 6, 7 while for forward iterators or better one of 3, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 7 or 3, 4, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 5, 6, 7 depending on whether the size is updated before or after the elements are copied into the open space. I don't see anywhere that the results for forward iterators in such a case would be specified.

like image 156
Mark B Avatar answered Oct 03 '22 07:10

Mark B


Basically, the only reason to invalidate iterators/references for vector is a reallocation, since otherwise you are still pointing at some part of your vector.

C++11 23.3.6.3/5:

Remarks: Reallocation invalidates all the references, pointers, and iterators referring to the elements in the sequence. It is guaranteed that no reallocation takes place during insertions that happen after a call to reserve() until the time when an insertion would make the size of the vector greater than the value of capacity().

This is again reiterated in the remarks to the insert functions C++11 23.3.6.5/1:

Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid. [...]

With vector, you can consider your iterators to behave very much like pointers (which shows why exactly reallocation will cause problems). In fact, the reference type is value_type&, as defined by the standard, showing that references are indeed not even wrapped.

Note, that the iterators' target may change by the insertion, since the underlying data changes. Also, to be standards compliant, you would need to ensure that reallocation does not happen (e.g. with a call to reserve).

like image 21
gha.st Avatar answered Oct 03 '22 07:10

gha.st