
C++11: ill-formed calls are undefined behavior?

§ 14.6.4.2 from N3485 states the following about dependent candidate function lookup:

If the call would be ill-formed or would find a better match had the lookup within the associated namespaces considered all the function declarations with external linkage introduced in those namespaces in all translation units, not just considering those declarations found in the template definition and template instantiation contexts, then the program has undefined behavior.

What exactly does it mean for a call to be "ill-formed", and how would an ill-formed call be selected by the lookup? Also, why does it matter that a better match would be found if all translation units were considered?

Josh Gao asked Mar 06 '13 21:03



4 Answers

Building on @tletnes' partial answer, I think I've come up with a simple program that triggers this particular undefined behavior. Of course it uses multiple translation units.

cat >alpha.cc <<EOF
#include <stdio.h>
void customization_point(int,int) { puts("(int,int)"); }
#include "beta.h"
extern void gamma();
int main() {
    beta(42);
    gamma();
}
EOF

cat >gamma.cc <<EOF
#include <stdio.h>
void customization_point(int,double) { puts("(int,double)"); }
#include "beta.h"
void gamma() { beta(42); }
EOF

cat >beta.h <<EOF
template<typename T>
void beta(T t) {
    customization_point(t, 3.14);
}
EOF

Compiling this program with different optimization levels changes its behavior. This is all right, according to the Standard, because the call in "alpha.cc" invokes undefined behavior.

$ clang++ alpha.cc gamma.cc -O1 -w ; ./a.out
(int,int)
(int,int)
$ clang++ alpha.cc gamma.cc -O2 -w ; ./a.out
(int,int)
(int,double)

Quuxplusone answered Oct 02 '22 02:10


What exactly does it mean for a call to be "ill-formed"

Formally, ill-formed is defined by [defns.ill.formed] as not well-formed, and a well-formed program is defined by [defns.well.formed] as:

C++ program constructed according to the syntax rules, diagnosable semantic rules, and the One Definition Rule (3.2).

So an ill-formed call is one with invalid syntax or a diagnosable error such as passing the wrong number of arguments, or arguments which cannot be converted to the parameter types, or an overload ambiguity.
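As a rough illustration of those categories, the sketch below uses C++11 expression SFINAE to ask the compiler whether a given call would be well-formed. The functions `one` and `two` and the `can_call_*` / `*_ok` helpers are hypothetical names invented for this example; they come from neither the Standard nor the question.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical functions, declared only; they are never actually called.
void one(int);
void two(double);
void two(char);

// Selected when the expression one(args...) is well-formed for these types.
template<class... Args>
auto can_call_one(int) -> decltype(one(std::declval<Args>()...), std::true_type{});
// Fallback, selected when the call above would be ill-formed.
template<class... Args>
std::false_type can_call_one(...);

template<class... Args>
auto can_call_two(int) -> decltype(two(std::declval<Args>()...), std::true_type{});
template<class... Args>
std::false_type can_call_two(...);

template<class... Args>
constexpr bool one_ok() { return decltype(can_call_one<Args...>(0))::value; }
template<class... Args>
constexpr bool two_ok() { return decltype(can_call_two<Args...>(0))::value; }

static_assert(one_ok<int>(), "one(int) is a valid call");
static_assert(!one_ok<int, int>(), "wrong number of arguments");
static_assert(!one_ok<std::string>(), "std::string does not convert to int");
static_assert(two_ok<double>(), "exact match for two(double)");
static_assert(!two_ok<int>(), "int -> double and int -> char tie: ambiguous");
```

Each `static_assert` that negates a check corresponds to a call the compiler would reject with a diagnostic if it were written directly.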

how would an ill-formed call be selected by the lookup?

I think it's saying "if (the call would be ill-formed || would find a better match) had the lookup within the associated namespaces considered all the function declarations with external linkage ...", which means you have undefined behaviour if considering other functions would have found equal or better matches. Equally good matches would make the call ambiguous, i.e. ill-formed, and better matches would have resulted in a different function being called.

So if in another context the call would have been ambiguous or caused another sort of error, but succeeds due to only considering a limited set of names in the instantiation and definition contexts, it's undefined. And if in another context the call would have chosen a better match, that's also undefined.
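For background on the "template definition and template instantiation contexts" wording: an unqualified call with a dependent argument is resolved partly by argument-dependent lookup at the point of instantiation, so it can find functions declared after the template itself. Here is a minimal sketch of that mechanism. The names `Found`, `describe`, `lib` and `other` are made up for illustration, and this single-translation-unit example is well-defined; it only shows the kind of lookup the rule is about.

```cpp
enum class Found { LibWidget, OtherGadget };

// No customization_point is visible here. The call is dependent on T,
// so the name is looked up again when the template is instantiated.
template<typename T>
constexpr Found describe(T t) { return customization_point(t); }

namespace lib {
    struct Widget {};
    constexpr Found customization_point(Widget) { return Found::LibWidget; }
}

namespace other {
    struct Gadget {};
    constexpr Found customization_point(Gadget) { return Found::OtherGadget; }
}

// ADL searches the namespace associated with each argument type.
static_assert(describe(lib::Widget{}) == Found::LibWidget, "found via namespace lib");
static_assert(describe(other::Gadget{}) == Found::OtherGadget, "found via namespace other");
```

Because the set of functions found this way depends on what has been declared in the associated namespace at each point of instantiation, two translation units can end up with different candidates for the same specialization.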

Also, why does it matter that a better match would be found if all translation units were considered?

I think the reason for the rule is to disallow situations where instantiating the same template specialization in two different contexts results in it calling two different functions. For example, if the call finds one function in one translation unit and a different function in another, you get two different instantiations of the same template, which violates the ODR. The linker keeps only one of them, so the instantiation that is discarded gets replaced by one that calls a function that wasn't even visible where the template was instantiated.

That's similar to (if not already covered by) the last sentence of the previous paragraph:

A specialization for any template may have points of instantiation in multiple translation units. If two different points of instantiation give a template specialization different meanings according to the one definition rule (3.2), the program is ill-formed, no diagnostic required.

Page 426 of the C++ ARM (Ellis & Stroustrup) gives a bit of context for that text (and I believe for 14.6.4.2 as well) and explains it more concisely and clearly than I did above:

This would seem to imply that a global name used from within a template could be bound to different objects or functions in different compilation units or even at different points within a compilation unit. However, should that happen, the resulting template function or class is rendered illegal by the "one-definition" rule (§7.1.2).

There's another related formulation of the same rules in [basic.def.odr]/6.

Jonathan Wakely answered Oct 01 '22 02:10


The problem is that namespaces can be defined piecemeal, so there is no one place that is guaranteed to define all of the members of a namespace. As a result, different translation units can see different sets of namespace members. What this section says is that if the part that isn't seen would affect lookup, the behavior is undefined. For example:

namespace mine {
    void f(double);
}

mine::f(2); // seems okay...

namespace mine {
    void f(char);
}

mine::f(2); // ambiguous, therefore ill-formed

The rule says that the first call to f(2) produces undefined behavior because it would have been ill-formed if all of the overloads in mine had been visible at that point.
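The way a reopened namespace changes the visible overload set can be sketched in a form that compiles. This is only an illustration under simplifying assumptions: it uses plain non-template calls in a single translation unit, where each call is resolved against the declarations seen so far, so it shows the visibility effect rather than the template rule itself. `Picked`, `early_call` and `late_exact` are hypothetical names.

```cpp
enum class Picked { Double, Char };

namespace mine {
    constexpr Picked f(double) { return Picked::Double; }
}

// Only mine::f(double) has been declared at this point, so 2 converts to double.
constexpr Picked early_call() { return mine::f(2); }

namespace mine {
    // The namespace is reopened and gains a second overload.
    constexpr Picked f(char) { return Picked::Char; }
}

// From here on, mine::f(2) would not compile: int -> double and int -> char
// are equally good conversions, so the call is ambiguous.
// constexpr Picked late_call() { return mine::f(2); }

// A call whose argument matches one overload exactly is still fine.
constexpr Picked late_exact() { return mine::f('x'); }

static_assert(early_call() == Picked::Double, "bound to the only overload visible");
static_assert(late_exact() == Picked::Char, "exact match beats char -> double");
```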

Pete Becker answered Oct 02 '22 02:10


When I read this rule, I imagine that code similar to the following is at least part of what was being considered:

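#include <stdio.h>

int foo(int a, int b){ printf("A"); return 0; }

int main(){
   foo(1, 1.0);   // only foo(int, int) is visible here, so 1.0 is converted to int
}

int foo(int a, double b){ printf("B"); return 0; }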

or

#include <stdio.h>

int foo(int a);

int main(){
   foo(1);   // only the declaration foo(int) is visible here
}

int foo(int a, double b){ printf("B"); return 0; }

tletnes answered Oct 01 '22 02:10