I can't understand what is going on in this snippet.
The map reference states "If the container is empty, the returned iterator value shall not be dereferenced."
But what about some_map->begin()->second
? on an empty map.
I thought it would be invalid, but this code prints '0'. Can anyone exaplin why?
int main()
{
map<int,int> a;
printf("%d",a.begin()->second);
return 1;
}
Thanks!
From this std::map::begin
reference
If the container is empty, the returned iterator will be equal to
end()
Then looking at this std::map::end
reference:
This element acts as a placeholder; attempting to access it results in undefined behavior.
[Emphasis mine]
You are experiencing something called undefined behavior and that's really all there is to say about it. Just don't do anything stupid like that.
As others have pointed out, your code has undefined behavior, and printing 0 is consistent with undefined behavior – as is any other behavior.
However, you may be interested in why this particular behavior occurs, as opposed to, e.g., a segfault. A likely reason is that your map
implementation uses dummy nodes. These are actually quite popular for implementations of node-based containers in general.
For example, a map
iterator may be a thin wrapper around a node pointer (e.g. to a node in a balanced binary search tree). Now, the obvious way to implement the end iterator is to use a null pointer here. However, it then becomes impossible to implement operator --()
on the end iterator without e.g. a pointer to the container as well – and, worse, an additional branch, because now you must check in every call to this operator whether the node pointer is null (I'm abbreviating the code example a bit, by leaving out irrelevant template parameters):
template<typename U, typename V>
typename map<U,V>::iterator &map<U,V>::iterator::operator --()
{
if ( node_ != nullptr )
{
// go to predecessor
if ( node_->leftChild_ != nullptr )
{
for ( node_ = node_->leftChild_; node_->rightChild_ != nullptr; node_ = node_->rightChild_ );
}
else
{
node *n;
do
{
n = node_;
node_ = node->parent_;
// may not go before begin()
assert( node_ != nullptr );
} while ( n == node_->leftChild_ );
}
}
else
{
// point to last node - map must not be empty
assert( container_->root_ != nullptr );
for ( node_ = container_->root_; node_->rightChild_ != nullptr; node = node->rightChild_ );
}
return *this;
}
However, if you have a dummy node that is always the rightmost node in the tree, and implement end iterators as wrappers around a pointer to the dummy node, the nullness check and therefore the second branch become redundant and can be removed. Likewise, the use of the container_
pointer becomes completely unnecessary, and so the pointer itself can be removed from the iterator
, saving space and reducing the cost to make and copy iterators. (Actually, since C++11 use of this "container pointer" is not even allowed anymore, because containers may be moved, and this does not invalidate the iterators, by definition. A valid solution would be more complex still. MINOR UPDATE: This may have been forbidden always because containers could be swapped without invalidating iterators.)
Now, this explains why an end()
iterator may actually point to "real" memory. And so, if operator *()
is implemented as:
template<typename U, typename V>
typename map<U,V>::reference map<U,V>::iterator::operator *() const
{
// cannot dereference end iterator
assert( hasSuccessor(node_) );
return *node_;
}
As above, I added a debugging assertion here. How is hasSuccessor
implemented? It actually has to ascend all the way to the top, in the worst case, to see if node_
or any of its ancestors has a right child. This check has prohibitive run time cost – i.e. it is literally forbidden by the standard. While its average complexity is only O(1), it has a worst case complexity of O(log N), but the standard requires Θ(1) even in the worst case for this operation. Granted, in a debug build, who cares, and there are other ways to implement the check, such as having a "dummy bit" somewhere in the node. The main point is that no such check is required anyway. Therefore, your attempt to dereferencing the iterator may give you a reference to the dummy node, and since it is "real" allocated memory, you can then proceed to read the mapped value. Note that this does not make the iterator "dereferencable" as far as the standard is concerned, it just means you get neither a segfault nor a program termination as an implementation detail that may change at any time without notice.
Now, there is possibly one more question as to why you get zero, as opposed to, let's say, -135, or some "random" 8-9 digit number for that matter. map
is not allowed to invoke any default constructor, for example, that has side effects. Unless you use operator []
, it is not even allowed to assume that your mapped type is default constructible. However, there are many other reasons why you may get a neat zero:
int
, map
is indeed free to initialize them as in int{}
, e.g. via placement new
.memset
the dummy node to zeroes before use. This could be the simplest way to ensure that the pointers are zero.memset
the whole page(s) to zeroes before letting the process access them.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With