
Possible ODR-violations when using a constexpr variable in the definition of an inline function (in C++14)

(Note! This question particularly covers the state of C++14, before the introduction of inline variables in C++17)

TLDR; Question

  • What constitutes odr-use of a constexpr variable used in the definition of an inline function, such that multiple definitions of the function violate [basic.def.odr]/6?

(... likely [basic.def.odr]/3; but could this silently introduce UB in a program as soon as, say, the address of such a constexpr variable is taken in the context of the inline function's definition?)

TLDR example: does a program where doMath() is defined as follows:

// some_math.h
#pragma once
#include <algorithm>  // std::max

// Forced by some guideline abhorring literals.
constexpr int kTwo{2};
inline int doMath(int arg) { return std::max(arg, kTwo); }
                                 // std::max(const int&, const int&)

have undefined behaviour as soon as doMath() is defined in two different translation units (say by inclusion of some_math.h and subsequent use of doMath())?

Background

Consider the following example:

// constants.h
#pragma once
constexpr int kFoo{42};

// foo.h
#pragma once
#include "constants.h"
inline int foo(int arg) { return arg * kFoo; }  // #1: kFoo not odr-used

// a.cpp
#include "foo.h"
int a() { return foo(1); }  // foo odr-used

// b.cpp
#include "foo.h"
int b() { return foo(2); }  // foo odr-used

compiled for C++14, that is, before C++17 introduced inline variables, so kFoo cannot be declared inline and has internal linkage in every translation unit.

The inline function foo (which has external linkage) is odr-used in both translation units (TUs) associated with a.cpp and b.cpp, say TU_a and TU_b, and shall thus be defined in both of these TUs ([basic.def.odr]/4).

[basic.def.odr]/6 covers the requirements for when such multiple definitions (in different TUs) may appear, and particularly /6.1 and /6.2 are relevant in this context [emphasis mine]:

There can be more than one definition of a [...] inline function with external linkage [...] in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then

  • /6.1 each definition of D shall consist of the same sequence of tokens; and

  • /6.2 in each definition of D, corresponding names, looked up according to [basic.lookup], shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution ([over.match]) and after matching of partial template specialization ([temp.over]), except that a name can refer to a non-volatile const object with internal or no linkage if the object has the same literal type in all definitions of D, and the object is initialized with a constant expression ([expr.const]), and the object is not odr-used, and the object has the same value in all definitions of D; and

  • ...

If the definitions of D do not satisfy these requirements, then the behavior is undefined.

/6.1 is fulfilled.

/6.2 is fulfilled if kFoo in foo:

  1. [OK] is const with internal linkage
  2. [OK] is initialized with a constant expression
  3. [OK] is of same literal type over all definitions of foo
  4. [OK] has the same value in all definitions of foo
  5. [??] is not odr-used.

I interpret 5 as particularly "not odr-used in the definition of foo"; this could arguably have been clearer in the wording. However if kFoo is odr-used (at least in the definition of foo) I interpret it as opening up for odr-violations and subsequent undefined behavior, due to violation of [basic.def.odr]/6.

As far as I can tell, [basic.def.odr]/3 governs whether kFoo is odr-used or not:

A variable x whose name appears as a potentially-evaluated expression ex is odr-used by ex unless applying the lvalue-to-rvalue conversion ([conv.lval]) to x yields a constant expression ([expr.const]) that does not invoke any non-trivial functions and, if x is an object, ex is an element of the set of potential results of an expression e, where either the lvalue-to-rvalue conversion ([conv.lval]) is applied to e, or e is a discarded-value expression (Clause [expr]). [...]

but I'm having a hard time understanding whether kFoo is considered odr-used if, e.g., its address is taken within the definition of foo, and whether taking its address outside of the definition of foo affects whether [basic.def.odr]/6.2 is fulfilled.
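To make the distinction concrete, here is a minimal sketch of three uses of kFoo and how I read [basic.def.odr]/3 for each of them. The function names timesFoo, atLeastFoo and addressOfFoo are made up for illustration and do not appear elsewhere in this question; the comments reflect my interpretation, not a definitive ruling.

```cpp
#include <algorithm>

constexpr int kFoo{42};

// Likely NOT an odr-use: the lvalue-to-rvalue conversion is applied
// directly to kFoo and yields a constant expression.
inline int timesFoo(int arg) { return arg * kFoo; }

// Likely an odr-use: kFoo binds directly to the const int& parameter
// of std::max, so no lvalue-to-rvalue conversion is applied at this point.
inline int atLeastFoo(int arg) { return std::max(arg, kFoo); }

// Likely an odr-use: taking the address requires the object itself.
inline const int* addressOfFoo() { return &kFoo; }
```

Within a single translation unit all three functions are well-defined; the concern only arises once the latter two are defined in more than one TU, where each definition would name a different kFoo object.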


Further details

Particularly, consider if foo is defined as:

// #2 (assumes <iostream> is included)
inline int foo(int arg) { 
    std::cout << "&kFoo in foo() = " << &kFoo << "\n";
    return arg * kFoo; 
}

and a() and b() are defined as:

int a() { 
    std::cout << "TU_a, &kFoo = " << &kFoo << "\n";
    return foo(1); 
}

int b() { 
    std::cout << "TU_b, &kFoo = " << &kFoo << "\n";
    return foo(2); 
}

then running a program which calls a() and b() in sequence produces:

TU_a, &kFoo    = 0x401db8
&kFoo in foo() = 0x401db8  // <-- foo() in TU_a: 
                           //     &kFoo from TU_a

TU_b, &kFoo    = 0x401dbc
&kFoo in foo() = 0x401db8  // <-- foo() in TU_b: 
                           // !!! &kFoo from TU_a

that is, a() and b() each print the address of their own TU-local kFoo, whereas foo() prints the same kFoo address (the one from TU_a) regardless of which TU it is called from.

Does this program (with foo and a/b defined as per this section) have undefined behaviour?

A real life example would be where these constexpr variables represent mathematical constants, and where they are used, from within the definition of an inline function, as arguments to utility math functions such as std::max(), which takes its arguments by reference.

asked Sep 08 '21 15:09 by dfrib


1 Answer

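In the OP's example with std::max, an ODR violation does indeed occur, and the program is ill-formed, no diagnostic required (IFNDR; the C++14 wording quoted in the question calls this undefined behavior). To avoid this issue, you might consider one of the following fixes: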

  • give the doMath function internal linkage, or
  • move the declaration of kTwo inside doMath
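As a rough sketch of what these two fixes could look like (the namespaces fix1 and fix2 are hypothetical and only there so both variants fit in one snippet; in a real header you would pick one):

```cpp
#include <algorithm>

// Fix 1: give doMath internal linkage. Each translation unit then has its
// own doMath, so the multiple-definition rules of [basic.def.odr]/6 do not
// apply to it.
namespace fix1 {
constexpr int kTwo{2};
static inline int doMath(int arg) { return std::max(arg, kTwo); }
}  // namespace fix1

// Fix 2: declare the constant inside the function. The name kTwo now refers
// to an entity defined within the definition of doMath itself.
namespace fix2 {
inline int doMath(int arg) {
    constexpr int kTwo{2};
    return std::max(arg, kTwo);  // result is copied out before kTwo goes away
}
}  // namespace fix2
```

The first variant trades the problem for one copy of the function per translation unit; the second keeps external linkage but makes the constant unavailable to other functions.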

A variable that is used by an expression is considered to be odr-used unless there is a certain kind of simple proof that the reference to the variable can be replaced by the compile-time constant value of the variable without changing the result of the expression. If such a simple proof exists, then the standard requires the compiler perform such a replacement; consequently the variable is not odr-used (in particular, it does not require a definition, and the issue described by the OP would be avoided because none of the translation units in which doMath is defined would actually reference a definition of kTwo). If the expression is too complicated, however, then all bets are off. The compiler might still replace the variable with its value, in which case the program may work as you expect; or the program may exhibit bugs or crash. That's the reality with IFNDR programs.

The case where the variable is immediately passed by reference to a function, with the reference binding directly, is one common case where the variable is used in a way that is too complicated and the compiler is not required to determine whether or not it may be replaced by its compile-time constant value. This is because doing so would necessarily require inspecting the definition of the function (such as std::max<int> in this example).

You can "help" the compiler by writing int(kTwo) and using that as the argument to std::max as opposed to kTwo itself; this prevents an odr-use since the lvalue-to-rvalue conversion is now immediately applied prior to calling the function. I don't think this is a great solution (I recommend one of the two solutions that I previously mentioned) but it has its uses (GoogleTest uses this in order to avoid introducing odr-uses in statements like EXPECT_EQ(2, kTwo)).
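A minimal sketch of that workaround applied to the OP's header, assuming nothing else in doMath refers to kTwo:

```cpp
#include <algorithm>

constexpr int kTwo{2};

inline int doMath(int arg) {
    // int(kTwo) applies the lvalue-to-rvalue conversion right away and
    // produces a temporary int; std::max binds its reference to that
    // temporary rather than to the TU-local kTwo object.
    return std::max(arg, int(kTwo));
}
```

The temporary lives until the end of the return statement's full-expression, so copying the result of std::max into the int return value is safe.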

If you want to know more about how to understand the precise definition of odr-use, involving "potential results of an expression e...", that would be best addressed with a separate question.

answered Nov 10 '22 12:11 by Brian Bi