
Global constants in C++11

What are the best ways to declare and define global constants in C++? I am mostly interested in C++11 standard as it fixes a lot in this regard.

[EDIT (clarification)]: in this question "global constant" denotes constant variable or function that is known at compile time in any scope. Global constant must be accessible from more than one translation unit. It is not necessarily constexpr-style constant - can be something like const std::map<int, std::string> m = { { 1, "U" }, { 5, "V" } }; or const std::map<int, std::string> * mAddr() { return & m; }. I do not touch preferable good-style scope or name for constant in this question. Let us leave these matters for another question. [END_EDIT]

I want to know answers for all the different cases, so let us assume that T is one of the following:

typedef    int                     T;  // 1
typedef    long double             T;  // 2
typedef    std::array<char, 1>     T;  // 3
typedef    std::array<long, 1000>  T;  // 4
typedef    std::string             T;  // 5
typedef    QString                 T;  // 6
class      T {
   // unspecified amount of code
};                                     // 7
// Something special
// not mentioned above?                // 8

I believe that there is no big semantic (I do not discuss good naming or scope style here) difference between the 3 possible scopes:

// header.hpp
extern const T tv;
T tf();                  // Global
namespace Nm {
    extern const T tv;
    T tf();              // Namespace
}
struct Cl {
    static const T tv;
    static T tf();       // Class
};

If the best choice among the alternatives below depends on which of these declaration scopes is used, please point it out.

Also consider the case where a function call is used in the constant's definition, e.g. <some value>==f();. How does calling a function during constant initialization affect the choice between the alternatives?

  1. Let us consider T with constexpr constructor first. Obvious alternatives are:

    // header.hpp
    namespace Ns {
    constexpr T A = <some value>;
    constexpr T B() { return <some value>; }
    inline const T & C() { static constexpr T t = <some value>; return t; }
    const T & D();
    }
    
    // source.cpp
    const T & Ns::D() { static constexpr T t = <some value>; return t; }
    

    I believe that A and B are most suitable for small T (such that having multiple instances or copying it at runtime is not a problem), e.g. 1-3, sometimes 7. C and D are better if T is large, e.g. 4, sometimes 7.

  2. T without constexpr constructor. Alternatives:

    // header.hpp
    namespace Ns {
    extern const T a;
    inline T b() { return <some value>; }
    inline const T & c() { static const T t = <some value>; return t; }
    const T & d();
    }
    
    // source.cpp
    const T Ns::a = <some value>;  // external linkage comes from the extern declaration in header.hpp
    const T & Ns::d() { static const T t = <some value>; return t; }
    

    I would not normally use a because of static initialization order fiasco. As far as I know, b, c and d are perfectly safe, even thread-safe since C++11. b does not seem to be a good choice unless T has a very cheap constructor, which is uncommon for non-constexpr constructors. I can name one advantage of c over d - no function call (run-time performance); one advantage of d over c - less recompiling when constant's value is changed (these advantages also apply to C and D). I am sure that I missed a lot of reasoning here. Provide other considerations in answers please.
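
For concreteness, here is a minimal single-file sketch of alternatives b and c, assuming T is the std::map from the clarification above. The names b and c mirror header.hpp; everything is put in one file only to keep the sketch short.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace Ns {
using T = std::map<int, std::string>;

// Alternative b: builds and returns a fresh T on every call.
inline T b() { return T{{1, "U"}, {5, "V"}}; }

// Alternative c: a single T, constructed on the first call.
// Since C++11 the initialization of a function-local static is thread-safe.
inline const T& c() {
    static const T t = {{1, "U"}, {5, "V"}};
    return t;
}
}  // namespace Ns
```

Every call to Ns::c() returns a reference to the same object, while every call to Ns::b() pays for constructing a new map.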

If you want to modify / test the above code, you can use my test files (just header.hpp, source.cpp with compilable versions of above code fragments and main.cpp that prints constants from header.hpp): https://docs.google.com/uc?export=download&id=0B0F-aqLyFk_PVUtSRnZWWnd4Tjg

asked May 14 '14 12:05 by vedg



3 Answers

I believe that there is no big difference between the following declaration locations:

This is wrong in a lot of ways.

The first declaration pollutes the global namespace; you have claimed the name "tv", so it can never be used again without risk of misunderstanding. This can cause shadowing warnings, it can cause linker errors, and it can cause all sorts of confusion for anyone who uses your header. It can also cause problems for someone who doesn't use your header, through a collision with someone else who happens to use the same name for a global.

Such an approach is not recommended in modern C++, but is ubiquitous in C, and therefore leads to much use of the static keyword for "global" variables in a .c file (file scope).

The second declaration pollutes a namespace; this is much less of an issue, as namespaces are freely renamable and can be made at no cost. As long as two projects use their own, relatively specific namespace, no collisions will occur. In the case where such collisions do occur, the namespaces for each can be renamed to avoid any issues.

This is more modern, C++03 style, and C++11 expands this tactic considerably with renaming of templates.

The third approach is a struct, not a class; they have differences, especially if you want to maintain compatibility with C. The benefits of a class scope compound on the namespace scope; not only can you easily encapsulate multiple things and use a specific name, you can also increase encapsulation via methods and information hiding, greatly expanding how useful your code is. This is mostly the benefit of classes, irrespective of scoping benefits.

You should almost certainly not use the first one, unless your functions and variables are very broad and STL/STD like, or your program is very small and not likely to be embedded or reused.

Let's now look at your cases.

  1. The size of the constructor, if it returns a constant expression, is unimportant; all of the code ought to be executable at compile time. This means the complexity is not meaningful; it will always compile to a single, constant, return value. You should almost certainly never use C or D; all that does is make the optimizations of constexpr not work. I would use whichever of A and B looks more elegant, probably a simple assignment would be A, and a complex constant expression would be B.

  2. None of these are necessarily thread safe; the content of the constructor determines both thread and exception safety, and it is quite easy to make any of these statements not thread safe. In fact, a is most likely to be thread safe; as long as the object is not accessed until main is called, it should be fully formed; the same cannot be said of any of your other examples. As for your analysis of b, in my experience, most constructors (especially exception safe ones) are cheap as they avoid allocation. In such cases, there's unlikely to be much difference between any of your cases.

I would highly recommend you stop attempting micro-optimizations like this and perhaps get a more solid understanding of C++ idioms. Most of the things you are trying to do here are unlikely to result in any increase in performance.

answered Oct 20 '22 16:10 by Alice


You didn't mention an important option:

namespace
{
    const T t = .....;
}

Now there are no name collision issues.

This isn't appropriate if T is something you only want to construct once. But having a large "global" object, const or not, is something you really want to avoid. It breaks encapsulation, and also introduces the static initialization order fiasco into your code.

I've never had the need for a large extern const object. If I need a large hardcoded lookup table for example, then I write a function (perhaps as a class member) that looks up the table; and the table is local to the unit with the implementation of that function.
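
A minimal sketch of that arrangement, assuming a hypothetical table of status texts. The names status_text, Entry and kStatusTable are made up for illustration; in a real project the function declaration would go in a header and the rest in a single .cpp file.

```cpp
#include <cassert>
#include <cstring>

// Would live in a header: the only thing other units can see.
const char* status_text(int code);

// Would live in one .cpp file: the table has internal linkage.
namespace {
struct Entry {
    int code;
    const char* text;
};
const Entry kStatusTable[] = {
    {200, "OK"},
    {404, "Not Found"},
    {500, "Internal Server Error"},
};
}  // namespace

const char* status_text(int code) {
    for (const Entry& e : kStatusTable) {
        if (e.code == code) return e.text;
    }
    return "Unknown";  // fallback for codes missing from the table
}
```

The table is constant-initialized and no other unit can name it, so its layout can change without touching any caller.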

In my code that seems to call for a large non-const global object, I actually have a function,

namespace MyStuff
{
     T &get_global_T();
}

which constructs the object on first use. (Actually, the object itself is hidden in one unit, and T is a helper class that specifies an interface; so I can mess around with the object's details and not disturb any code that is using it).
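
Roughly, that pattern could look like the sketch below. The next_id member and the Counter class are assumptions invented for the example; only MyStuff::get_global_T comes from the declaration above.

```cpp
#include <cassert>

// Public interface (header): callers only ever see T.
namespace MyStuff {
class T {
public:
    virtual ~T() = default;
    virtual int next_id() = 0;
};
T& get_global_T();
}  // namespace MyStuff

// Implementation unit: the concrete object is hidden here.
namespace {
class Counter final : public MyStuff::T {
public:
    int next_id() override { return ++last_; }

private:
    int last_ = 0;
};
}  // namespace

MyStuff::T& MyStuff::get_global_T() {
    static Counter instance;  // constructed on the first call
    return instance;
}
```

Callers write MyStuff::get_global_T().next_id() and never depend on Counter, so the hidden class can be changed freely.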

answered Oct 20 '22 16:10 by M.M


1

In case A there is a difference between global or namespace scope (internal linkage) and class scope (external linkage). So

// header.hpp
constexpr T A = <some value>; // internal linkage
namespace Nm { constexpr T A = <some value>; } // internal linkage
class Cl { public: static constexpr T A = <some value>; }; // not enough!

Consider the following usage:

// user.cpp
std::cout << A << Nm::A << Cl::A; // ok
std::cout << &A << &Nm::A;        // ok
std::cout << &Cl::A;              // linker error: undefined reference to `Cl::A'

Placing Cl::A definition in source.cpp (in addition to the above Cl::A declaration) eliminates this error:

// source.cpp
constexpr T Cl::A;

External linkage means that there would always be only one instance of Cl::A. So Cl::A seems to be a very good candidate for large T. However: can we be sure that static initialization order fiasco would not present itself in this case? I believe that the answer is yes, because Cl::A is constructed at compile-time.
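
A single-file sketch of the class-scope variant, assuming T is int and a hypothetical helper address_of_A. Note that since C++17 a static constexpr data member is implicitly inline, so the out-of-class definition is only needed for C++11/14; the preprocessor guard reflects that.

```cpp
#include <cassert>

struct Cl {
    static constexpr int A = 42;  // declaration with initializer
};

#if __cplusplus < 201703L
// C++11/14: the single out-of-class definition, normally placed in source.cpp.
constexpr int Cl::A;
#endif

// Taking the address odr-uses Cl::A; in C++11/14 this needs the definition above.
const int* address_of_A() { return &Cl::A; }

// The value itself is usable in constant expressions everywhere.
static_assert(Cl::A == 42, "Cl::A is a compile-time constant");
```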

I have tested A, B, a alternatives with g++ 4.8.2 and 4.9.0, clang++ 3.4 on GNU/Linux platform. The results for three translation units:

  • A in class scope with definition in source.cpp was both immune to fiasco and had the same address in all translation units even at compile-time.
  • A in namespace or global scope had 3 different addresses both for large array and constexpr const char * A = "A"; (because of internal linkage).
  • B (std::array<long double, 100>) in any scope had 2 different addresses (address was the same in 2 translation units); additionally all 3 B addresses suggested some different memory location (they were much bigger than other addresses) - I suspect that array was copied in memory at runtime.
  • a when used with constexpr types T, e.g. int, const char *, std::array, AND initialized with constexpr expression in source.cpp, was as good as A: immune to fiasco and had the same address in all translation units. If constant of constexpr type T is initialized with non-constexpr, e.g. std::time(nullptr), and used before initialization, it would contain default value (for example, 0 for int). It means that constant's value can depend on static initialization order in this case. So, do not initialize a with non-constexpr value!
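
The last point can be imitated in a single file. Within one translation unit, dynamic initialization runs in definition order, so the sketch below forces the unlucky order that may or may not occur across translation units. The names early and late are made up for the illustration.

```cpp
#include <cassert>
#include <ctime>

extern const std::time_t late;  // defined further down

// Dynamically initialized first: at this point `late` has only been
// zero-initialized, so `early` captures 0.
const std::time_t early = late;

// Non-constant initializer, so `late` gets its value at run time,
// after `early` has already been initialized.
const std::time_t late = std::time(nullptr);
```

If late were initialized with a constant expression instead, it would already have its final value before any dynamic initialization starts, and early would see that value.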

The bottom line

  1. prefer A in class scope for any constexpr constant in most cases because it combines perfect safety, simplicity, memory saving and performance.
  2. a (initialized with constexpr value in source.cpp!) should be used if namespace scope is preferable or it is desirable to avoid initialization in header.hpp (in order to reduce dependencies and compilation time). a has one disadvantage compared to A: it can be used in compile-time expressions only in source.cpp and only after initialization.
  3. B should be used for small T in some cases: when namespace scope is preferable or template compile-time constant is needed (pi for example). Also B can be used when constant's value is rarely used or used only in exceptional situations, e.g. error messages.
  4. Other alternatives should almost never be used as they would rarely suit better than all 3 before-mentioned ways.
    • A in namespace scope should not be used because it can potentially lead to N instances of the constant, hence consume sizeof(T) * N bytes of memory and cause cache misses. Here N equals the number of translation units that include header.hpp. As noted in this proposal, A in namespace scope can violate ODR if used in an inline function.
    • C could be used for big T (B is usually better for small T) in 2 rare scenarios: when function call is preferable; when namespace scope AND initializing in header is preferable.
    • D could be used when function call AND initializing in source file is preferable.
    • The only shortcoming of C compared to A and B is that its return value cannot be used in a compile-time expression. D suffers from the same shortcoming and another one: a run-time function call penalty (because it cannot be inlined).
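
As an illustration of the template case mentioned for B, here is a minimal sketch of pi as a constexpr function template. The function circumference is a hypothetical caller added only to show usage.

```cpp
#include <cassert>

namespace Ns {
// B-style constant: a constexpr function, templated on the result type.
template <typename T>
constexpr T pi() {
    return static_cast<T>(3.14159265358979323846L);
}

// Usable wherever a constant expression is required.
static_assert(pi<int>() == 3, "conversion to int truncates");

// Ordinary run-time use; the call is expected to fold to a constant.
inline double circumference(double radius) {
    return 2 * pi<double>() * radius;
}
}  // namespace Ns
```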

2

Avoid using non-constexpr a because of the static initialization order fiasco. Consider a only when it addresses a confirmed bottleneck. Otherwise, safety is more important than a small performance gain. b, c and d are much safer. However, c and d have 2 safety requirements:

for (auto f : {all c and d-like functions}) {

  • T constructor must not call f because if the initialization of static local variable recursively enters the block in which the variable is being initialized, the behavior is undefined. This is not difficult to ensure.
  • For each class X such that X::~X calls f and there is a statically initialized X object: X::X must call f. The reason is that otherwise static const T from f could be constructed after and therefore destructed before global X object; then X::~X would cause UB. This requirement is much more difficult to guarantee than the previous one. So it almost prohibits global or static local variables with complicated destructors that use global constants. If destructor of statically initialized variable is not complicated, e.g. uses f() for logging purposes, then placing f(); in the corresponding constructor ensures safety.

}
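
A small sketch of the second requirement, assuming a c-style function prefix() and a class X that uses it in its destructor. Both names, and the last_logged_size variable standing in for real logging, are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// c-style constant.
inline const std::string& prefix() {
    static const std::string s = "[app] ";
    return s;
}

std::size_t last_logged_size = 0;  // stands in for real logging

struct X {
    // Touch prefix() first: its static is then fully constructed before
    // X's constructor completes, and is therefore destroyed after X.
    X() { (void)prefix(); }
    ~X() { last_logged_size = prefix().size(); }
};
```

With a global X object the destructor runs after main returns, which is exactly when the ordering matters. Without the call in X::X, the static inside prefix() could be constructed later than the global X and therefore be destroyed before X::~X uses it.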

Note: these 2 requirements do not apply to C and D:

  • the recursive call to f would not compile;
  • static constexpr T constants in C and D are constructed at compile time - before any non-trivial variable is constructed, so they are destructed after all non-trivial variables' destruction (destructors are called in reverse order).

Note 2: C++ FAQ suggests a different implementation of c and d, which does not impose the second safety requirement. However in this case static constant is never destructed, which can interfere with memory leak detection, e.g. Valgrind diagnostic. Memory leaks, however benign, should be avoided. So these modified versions of c and d should be used only in exceptional situations.
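
As far as I recall, the FAQ variant is the construct-on-first-use idiom with a heap-allocated object. A minimal sketch, assuming T is std::string and using the hypothetical name c_leaky:

```cpp
#include <cassert>
#include <string>

inline const std::string& c_leaky() {
    // Allocated on the first call and intentionally never deleted, so the
    // object stays valid while other static objects are being destroyed.
    static const std::string* const p = new std::string("value");
    return *p;
}
```

The pointer itself is a trivially destructible static, so nothing is torn down at exit; the price is the one allocation that is never freed.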

One more alternative to consider here is a constant with internal linkage:

// header.hpp
namespace Ns { namespace { const T a1 = <some value>; } }

This approach has the same big downside as A in namespace scope: internal linkage can create as many copies of a1 as the number of translation units that include header.hpp. It can also violate ODR in the same way as A. However, since other options for non-constexpr are not as good as for constexpr constants, this alternative actually could have some rare use. BUT: this "solution" is still prone to static initialization order fiasco in case when a1 is used in public function which in turn is used for initialization of a global object. So introducing internal linkage does not solve the problem - just hides it, makes it less likely, probably more difficult to locate and fix.

The bottom line

  • c provides the best performance and saves memory because it facilitates reusing exactly one T instance and can be inlined, so it should be used in most cases.
  • d is as good as c for saving memory but is worse for performance as it would never be inlined. However d can be used to reduce compilation time.
  • consider b for small types or for rarely used constants (in the rarely-used-constant case its definition can be moved to source.cpp to avoid recompilation on change). Also, b is the only solution if the safety requirements for c and d cannot be satisfied. b is definitely not good for large T if the constant is used often, because the constant has to be constructed each time b is called.

Note: there is another compile-time issue of inline functions and variables initialized in header.hpp. If constant's definition depends on another constant declared in a different header bad.h, and header bad.h should not be included in header.hpp, then D, d, a and modified b (with definition moved to source.cpp) are the only alternatives.

answered Oct 20 '22 17:10 by vedg