
Strings in headers -- does this violate the ODR?

Consider the following program with two compilation units.


// a.hpp

class A {
public:  // get() must be public so that f() and g() can call it
  static const char * get() { return "foo"; }
};

void f();

// a.cpp

#include "a.hpp"
#include <iostream>

void f() {
  std::cout << A::get() << std::endl;
}

// main.cpp

#include "a.hpp"
#include <iostream>

void g() {
  std::cout << A::get() << std::endl;
}

int main() {
  f();
  g();
}

It is quite common to need to create global string constants for some reason or other. Doing this in the totally naive way causes linker problems. Usually, people put a declaration in the header and a definition in a single compilation unit, or use macros.
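For comparison, here is a minimal sketch of the "declaration in the header, definition in a single compilation unit" approach. The file names and the names `kGreeting` and `greeting_copy` are hypothetical, and the pieces are shown in one block only for brevity; in a real project they would be split across files as the comments indicate.

```cpp
#include <string>

// --- would live in constants.hpp (hypothetical file name) ---
// Declaration only: every translation unit that includes the header sees
// the name, but no storage is defined here.
extern const char* const kGreeting;

// By contrast, a naive non-const definition in the header, such as
//   const char* kNaive = "foo";
// would be defined once per including translation unit and typically
// fails at link time with a multiple-definition error.

// --- would live in constants.cpp (hypothetical file name) ---
// The single definition for the whole program.
const char* const kGreeting = "foo";

// --- any other translation unit ---
inline std::string greeting_copy() { return kGreeting; }
```
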

I had been under the impression that this way of doing it (shown above) with a function was "okay", because it is an inline function and the linker eliminates any duplicate copies that are produced, and programs written using this pattern seem to work fine. However, now I have my doubts about whether it's actually legitimate.

The function A::get is odr-used in two different translation units, but it is implicitly inline since it is defined inside the class body.

In [basic.def.odr.6] it states:

There can be more than one definition of a ... inline function with external linkage (7.1.2)... in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
- each definition of D shall consist of the same sequence of tokens; and
- in each definition of D, corresponding names, looked up according to 3.4, shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution (13.3) and after matching of partial template specialization (14.8.3), except that a name can refer to a non-volatile const object with internal or no linkage if the object has the same literal type in all definitions of D, and the object is initialized with a constant expression (5.19), and the object is not odr-used, and the object has the same value in all definitions of D; and
- in each definition of D, corresponding entities shall have the same language linkage; and
- ... (more conditions that don't seem relevant)

If the definitions of D satisfy all these requirements, then the program shall behave as if there were a single definition of D. If the definitions of D do not satisfy these requirements, then the behavior is undefined.
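To make the second bullet's exception more concrete, here is a small sketch of my understanding of it. Assume the whole block sits in a header that is included from several translation units; the names `kLimit`, `limit_by_value`, and `limit_by_reference` are made up for illustration.

```cpp
// A const object at namespace scope has internal linkage, so every
// translation unit that includes this header gets its own kLimit.
const int kLimit = 3;

// As I read the exception, this is fine: kLimit is initialized with a
// constant expression and is only read for its value, so it is not odr-used.
inline int limit_by_value() { return kLimit; }

// This one looks problematic when the header is included in more than one
// translation unit: binding a reference odr-uses kLimit, and the name then
// refers to a different object in each definition of the function.
inline const int& limit_by_reference() { return kLimit; }
```

Within a single translation unit both functions behave the same; the difference only matters once several translation units each contain a definition.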

In my example program, the two definitions (one in each translation unit) each correspond to the same sequence of tokens. (This is why I originally thought it was okay.)

However, it's not clear that the second condition is satisfied. Because, the name "foo" might not correspond to the same object in the two compilation units -- it's potentially a "different" string literal in each, no?

I tried changing the program:

  static const void * get() { return static_cast<const void*>("foo"); }

so that it prints the address of the string literal, and I get the same address, however I'm not sure if that's guaranteed to happen.

Does it fall under "... shall refer to an entity defined within the definition of D"? Is "foo" considered to be defined within A::get here? It might seem so, but as I understand informally, string literals ultimately cause the compiler to emit some sort of global const char[] which lives in a special segment of the executable. Is that "entity" considered to be within A::get or is that not relevant?

Is "foo" even considered a "name", or does the term "name" refer only to a valid C++ "identifier", such as could be used for a variable or function? On the one hand it says:

[basic][3.4]
A name is a use of an identifier (2.11), operator-function-id (13.5), literal-operator-id (13.5.8), conversion-function-id (12.3.2), or template-id (14.2) that denotes an entity or label (6.6.4, 6.1).

and an identifier is

[lex.name][2.11]
An identifier is an arbitrarily long sequence of letters and digits.

so it seems like a string literal is not a name.

On the other hand in section 5

[expr.prim.general][5.1.1.1]
A string literal is an lvalue; all other literals are prvalues.

Generally, I thought that lvalues have names.

Chris Beck asked Jun 18 '16 21:06

1 Answer

Your last argument is nonsense. "foo" isn't even grammatically a name, but a string-literal. And string literals being lvalues and some lvalues having names does not imply that string literals are or have names. String literals as used in your code do not violate the ODR.
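A quick way to see that a string literal is an lvalue without being a name is to inspect its type at compile time. This is only a sketch; the alias `Arr4` and the function `literal_address` are names made up for the illustration.

```cpp
#include <type_traits>

// For an expression that is not an id-expression, decltype yields T& when
// the expression is an lvalue of type T. "foo" is an lvalue of type
// const char[4] (three characters plus the terminating '\0').
static_assert(std::is_same<decltype("foo"), const char (&)[4]>::value,
              "a string literal is an lvalue of array type");

// Other literals are prvalues, so decltype yields a non-reference type.
static_assert(std::is_same<decltype(42), int>::value,
              "an integer literal is a prvalue");

// Being an lvalue means its address can be taken, even though no name
// denotes the underlying array object.
using Arr4 = const char[4];
inline Arr4* literal_address() { return &"foo"; }
```
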

Up to and including C++14, the standard actually mandated that string literals in multiple definitions of inline functions across TUs designate the same entity, but that superfluous and mostly unimplemented rule was removed by CWG 1823.
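In practice this means code should depend on the contents of the literal and not on its address. A minimal sketch, where the free function `get` stands in for `A::get` and `same_contents` is a hypothetical helper:

```cpp
#include <cstring>

// Stand-in for A::get from the question.
inline const char* get() { return "foo"; }

// Comparing contents is always well-defined and gives the same answer no
// matter how the implementation lays out or merges string literals.
inline bool same_contents(const char* a, const char* b) {
  return std::strcmp(a, b) == 0;
}

// Comparing addresses, e.g. get() == "foo", compiles, but whether two
// identical literals share storage is unspecified, so the result should
// not be relied upon.
```
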

Because, the name "foo" might not correspond to the same object in the two compilation units -- it's potentially a "different" string literal in each, no?

Correct, but that's irrelevant, because the ODR does not care about specific argument values. If you did manage to somehow get a different entity, e.g. a different function template specialization, to be called in each TU, that would be problematic, but fortunately string literals are invalid template arguments, so you're gonna have to be clever.
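To illustrate the template-argument point, here is a small sketch. `Tag` and `kFoo` are hypothetical names; the rejected form is left in a comment because it does not compile.

```cpp
#include <cstring>

// A template parameterized on a pointer to char.
template <const char* S>
struct Tag {
  static const char* value() { return S; }
};

// A named array with static storage duration is an acceptable argument.
constexpr char kFoo[] = "foo";
using FooTag = Tag<kFoo>;

// A string literal is not:
//   using Bad = Tag<"foo">;   // ill-formed: a string literal cannot be
//                             // used as a template argument here
```

Since the literal itself can never select a specialization, its identity cannot leak into which function gets called.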

Columbo answered Sep 22 '22 16:09