
Are comma separated statements considered full statements? (and other diagnostic issues)

I guess the answer is "no", but from a compiler point of view, I don't understand why.

I wrote a very simple piece of code that badly confuses compiler diagnostics (both clang and gcc), but I would like confirmation that the code is not ill-formed before I report misdiagnostics. I should point out that these are not compiler bugs: the output is correct in all cases, but I have doubts about the warnings.

Consider the following code:

#include <iostream>

int main(){
  int b,a;
  b = 3;
  b == 3 ? a = 1 : b = 2;
  b == 2 ? a = 2 : b = 1;
  a = a;
  std::cerr << a << std::endl;
}

The assignment of a is a tautology, meaning that a will be initialized after the two ternary statements, regardless of b. GCC is perfectly happy with this code. Clang is slightly more clever and spots something silly (warning: explicitly assigning a variable of type 'int' to itself [-Wself-assign]), but no big deal.

Now the same thing (semantically at least), but shorter syntax:

#include <iostream>

int main(){
  int b,a = (b=3, 
             b == 3 ? a = 1 : b = 2, 
             b == 2 ? a = 2 : b = 1, 
             a);
  std::cerr << a << std::endl;
}

Now the compilers give me completely different warnings. Clang no longer reports anything strange (which is probably correct, given the precedence imposed by the parentheses). GCC is a bit scarier and says:

test.cpp: In function ‘int main()’:
test.cpp:7:15: warning: operation on ‘a’ may be undefined [-Wsequence-point]

But is that true? That sequence-point warning hints that comma-separated statements are not handled the same way in practice, but I don't know whether they should be.

And it gets weirder when I change the code to:

#include <iostream>

int main(){
  int b,a = (b=3, 
             b == 3 ? a = 1 : b = 2, 
             b == 2 ? a = 2 : b = 1, 
             a+0); // <- I just changed this line
  std::cerr << a << std::endl;
}

Then clang suddenly realizes that there might be something fishy with a:

test.cpp:7:14: warning: variable 'a' is uninitialized when used within its own initialization [-Wuninitialized]
             a+0);
             ^

But there was no problem with a before... For some reason clang cannot spot the tautology in this case. Again, it might simply be because those are not full statements anymore.

The problems are:

  • is this code valid and well defined (in all versions)?
  • how is the list of comma separated statements handled? Should it be different from the first version of the code with explicit statements?
  • is GCC right to report undefined behavior and sequence point issues? (in this case clang is missing some important diagnostics) I am aware that it says may, but still...
  • is clang right to report that a might be uninitialized in the last case? (then it should have the same diagnostic for the previous case)

Edit and comments:

  • I am getting several (rightful) comments that this code is anything but simple. This is true, but the point is that the compilers mis-diagnose when they encounter comma-separated statements in initializers. This is a bad thing. I made my code more complete to avoid the "have you tried this syntax..." comments. A much more realistic and human readable version of the problem could be written, which would exhibit wrong diagnostics, but I think this version shows more information and is more complete.
  • in a compiler-torture test suite, this would be considered very understandable and readable, they do much much worse :) We need code like that to test and assess compilers. This would not look pretty in production code, but that is not the point here.
Thibaut asked May 28 '13 17:05



1 Answer

5 Expressions

10 In some contexts, an expression only appears for its side effects. Such an expression is called a discarded-value expression. The expression is evaluated and its value is discarded.

5.18 Comma operator [expr.comma]

A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded-value expression (Clause 5). Every value computation and side effect associated with the left expression is sequenced before every value computation and side effect associated with the right expression. The type and value of the result are the type and value of the right operand; the result is of the same value category as its right operand, and is a bit-field if its right operand is a glvalue and a bit-field.

It sounds to me like there's nothing wrong with your statement.

Looking more closely at the g++ warning, it says "may be undefined", which tells me that the parser isn't smart enough to see that a=1 is guaranteed to be evaluated.
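
To make the sequencing rule quoted above concrete, here is a minimal sketch that runs the same chain of operations twice, once as separate statements and once as a single comma expression. The function names with_statements, with_commas and check_equivalence are made up for illustration, and unlike the question's code both variables are initialized up front, so the sketch only shows the left-to-right ordering and says nothing about the self-initialization case or about which warnings a given compiler emits.

```cpp
#include <cassert>

// Hypothetical helper: the chain written as separate statements.
// Each ';' ends a full-expression, so the order is obvious.
int with_statements() {
    int a = 0, b = 0;  // assumption: initialized up front, unlike the question
    b = 3;
    b == 3 ? a = 1 : b = 2;  // b is 3, so a becomes 1
    b == 2 ? a = 2 : b = 1;  // b is still 3, so b becomes 1
    return a;
}

// Hypothetical helper: the same chain as one comma expression.
// Each left operand is fully evaluated (side effects included)
// before the operand to its right; the result is the last operand.
int with_commas() {
    int a = 0, b = 0;  // assumption: initialized up front, unlike the question
    return (b = 3,
            b == 3 ? a = 1 : b = 2,
            b == 2 ? a = 2 : b = 1,
            a);
}

// Hypothetical check: both forms yield the same value.
void check_equivalence() {
    assert(with_statements() == 1);
    assert(with_commas() == 1);
}
```

Both functions return 1, which is what the comma operator's left-to-right guarantee predicts: a = 1 has already taken effect by the time the final a is read.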

Steven Maitlall answered Nov 17 '22 04:11
