Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Strange std::map behaviour

The following test program

#include <map>
#include <iostream>

using namespace std;

int main(int argc, char **argv)
{
    map<int,int> a;
    a[1]=a.size();
    for(map<int,int>::const_iterator it=a.begin(); it!=a.end(); ++it)
            cout << "first " << (*it).first << " second " << (*it).second << endl;
}

leads to different output when compiled on g++ 4.8.1 (Ubuntu 12.04 LTS):

g++ xxx.cpp 
./a.out 
first 1 second 1

and on Visual Studio 2012 (Windows 7) (Standard Win32 Console Application Project):

ConsoleApplication1.exe
first 1 second 0

Which compiler is right? Am I doing something wrong?

like image 931
Wald Avatar asked Jan 15 '14 09:01

Wald


3 Answers

This is actually a well-formed program that has two equally valid execution paths, so both compilers are right.

a[1] = a.size()

In this expression, the evaluation of the two operands of = are unsequenced.

§1.9/15 [intro.execution] Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.

However, function calls are not interleaved, so the calls to operator[] and size are actually indeterminately sequenced, rather than unsequenced.

§1.9/15 [intro.execution] Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.

This means that the function calls may happen in one of two orders:

  1. operator[] then size
  2. size then operator[]

If a key doesn't exist and you call operator[] with that key, it will be added to the map, thereby changing the size of the map. So in the first case, the key will be added, the size will be retrieved (which is 1 now), and 1 will be assigned to that key. In the second case, the size will be retrieved (which is 0), the key will be added, and 0 will be assigned to that key.

Note, this is not a situation that brings about undefined behaviour. Undefined behaviour occurs when two modifications or a modification and a read of the same scalar object are unsequenced.

§1.9/15 [intro.execution] If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

In this situation, they are not unsequenced but indeterminately sequenced.

So what we do have is two equally valid orderings of the execution of the program. Either could happen and both give valid output. This is unspecified behaviour.

§1.3.25 [defns.unspecified]
unspecified behavior
behavior, for a well-formed program construct and correct data, that depends on the implementation


So to answer your questions:

Which compiler is right?

Both of them are.

Am I doing something wrong?

Probably. It's unlikely that you would want to write code that has two execution paths like this. Unspecified behaviour can be okay, unlike undefined behaviour, because it can be resolved to a single observable output, but it's not worth having in the first place if you can avoid it. Instead, don't write code that has this kind of ambiguity. Depending on what exactly you want correct path to be, you can do either of the following:

auto size = a.size();
a[1] = size; // value is 0

Or:

a[1];
a[1] = a.size(); // value is 1

If you want the result to be 1 and you know the key doesn't yet exist, you could of course do the first code but assign size + 1.

like image 137
Joseph Mansfield Avatar answered Oct 18 '22 00:10

Joseph Mansfield


In this case, where a[1] returns a primitive type, please refer to this answer. In the case in which the std::map's value type is an user defined type and operator=(T, std::size_t) is defined for that type, the expression:

a[1] = a.size();

can be converted to the corresponding less-syntactic-sugar version:

a[1] = a.size();
a.operator[](1) = a.size();
operator=(a.operator[](1), a.size());

And, as we all know from the §8.3.6/9:

The order of evaluation of function arguments is unspecified.

which leads to the fact that the result of the above expression is unspecified.

We have, of course, two cases:

  • If the a.operator[](1) is evaluated first, the size of the map is incremented by 1 leading to the first output (first 1 second 1).
  • If the a.size() is evaluated first, the output you'll get is the second one (first 1 second 0).
like image 20
Shoe Avatar answered Oct 18 '22 01:10

Shoe


This is known as a sequence-point issue which means certain operations may be performed in any order chosen by the compiler.

If one has side-effects on the other, it is called "unspecified behaviour" a bit like "undefined behaviour" however where the result must be one of a fixed subset of outcomes, so here it must be either 0 or 1 and can't be any other value. In reality you should usually avoid doing it.

In your particular case. performing operator [] on a map changes its size (if that element does not yet exist). Thus it has a side effect on the right hand side of what it is assigning to it.

like image 37
CashCow Avatar answered Oct 18 '22 00:10

CashCow