Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

std::remove with vector::erase and undefined behavior

All over the web I see people use the erase/remove idiom for C++ vectors like so:

#include <vector> // the general-purpose vector container #include <iostream> #include <algorithm> // remove and remove_if int main() {   // initialises a vector that holds the numbers from 0-9.   std::vector<int> v = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };    // removes all elements with the value 5   v.erase( std::remove( v.begin(), v.end(), 5 ), v.end() );    return 0; } 

That is, if I want to erase all elements matching some criteria (e.g. the number 5 from a vector of ints), then I use std::remove or std::remove_if in conjunction with vector.erase like so:

vector.erase( std::remove( vector.begin(), vector.end(), <some_value>), vector.end()); 

This works nicely in general; std::remove (and remove_if) will copy (or use move semantics in C++11) the elements that are to be deleted over to the end of the vector, so the vector from our previous example will now look like this:

{ 0, 1, 2, 3, 4, 6, 7, 8, 9, 5 };

With the element 5 bolded because it's been moved to the end.

Now, std::remove will return an iterator to it, which we then use in erase to clear the elements out. Nice.

But what about the following example?

int main() {   // initialises an empty vector.   std::vector<int> v = {};    // removes all elements with the value 5   v.erase( std::remove( v.begin(), v.end(), 5 ), v.end() );    return 0; } 

This seems to work as expected (not erasing anything, not segfaulting, etc.) on all platforms I run it on, but I know that just because something is working, doesn't mean it's not undefined behavior.

The quick reference for vector.erase says this (emphasis mine):

iterator erase (const_iterator first, const_iterator last); 

first, last are

Iterators specifying a range within the vector] to be removed: [first,last). i.e., the range includes all the elements between first and last, including the element pointed by first but not the one pointed by last. Member types iterator and const_iterator are random access iterator types that point to elements.

So is vector.erase(vector.end(),vector.end()) undefined behavior?

Here's what the quick reference says about exception safety:

If the removed elements include the last element in the container, no exceptions are thrown (no-throw guarantee). Otherwise, the container is guaranteed to end in a valid state (basic guarantee). An invalid position or range causes undefined behavior.

So, the answer, at least to me appears to be "YES", and this StackOverflow answer seems to support it.

Therefore, is the common idiom wrong?

Assuming it's undefined behavior, then any call to remove could return an iterator to vector.end() which should be checked before calling vector.erase, and calling remove on an empty vector does seem to return vector.end: (IDEOne for code below)

#include <iostream> #include <algorithm> #include <vector> using namespace std;  int main() {    vector<int> myInts;    auto anIter = std::remove(myInts.begin(),myInts.end(),5);    if (anIter == myInts.end())       std::cout << "iterator = myInts.end()"; } 

Finally, my question:

Should the actual remove/erase idiom be this?

auto endOfRangeIterator = std::remove(vector.begin(), vector.end(), <value>); if (endOfRangeIterator != vector.end())    vector.erase(endOfRangeIterator, vector.end()) 
like image 332
AndyG Avatar asked May 20 '14 13:05

AndyG


People also ask

How do I erase an element from std::vector <> by value?

You need to use std::remove algorithm to move the element to be erased to the end of the vector and then use erase function. Something like: myVector. erase(std::remove(myVector. begin(), myVector.

Does vector erase delete object?

Yes. vector::erase destroys the removed object, which involves calling its destructor.


1 Answers

24.2.1/7 Most of the library’s algorithmic templates that operate on data structures have interfaces that use ranges. A range is a pair of iterators that designate the beginning and end of the computation. A range [i,i) is an empty range; in general, a range [i,j) refers to the elements in the data structure starting with the element pointed to by i and up to but not including the element pointed to by j.

Emphasis mine.

Further, the description of erase you cite is not the normative text in the standard. The standard has this to say (Table 100):

a.erase(q1,q2)

Effects: Erases the elements in the range [q1, q2).

This doesn't require that q1 be dereferenceable. If [q1, q2) is an empty range (per 24.2.1/7), then no elements are in the range, and so none are erased.

like image 123
Igor Tandetnik Avatar answered Sep 17 '22 13:09

Igor Tandetnik