
Is it safe to use the address of a static local variable within a function template as a type identifier?

I wish to create an alternative to std::type_index that does not require RTTI:

    template <typename T>
    int* type_id() {
        static int x;
        return &x;
    }

Note that the address of the local variable x is used as the type ID, not the value of x itself. Also, I don't intend to use a bare pointer in reality. I've just stripped out everything not relevant to my question. See my actual type_index implementation here.

Is this approach sound, and if so, why? If not, why not? I feel like I am on shaky ground here, so I am interested in the precise reasons why my approach will or will not work.

A typical use case might be to register routines at run-time to handle objects of different types through a single interface:

    #include <functional>
    #include <map>
    #include <stdexcept>

    class processor {
    public:
        // Registers a handler for objects of type T, keyed by T's type ID.
        template <typename T, typename Handler>
        void register_handler(Handler handler) {
            handlers[type_id<T>()] = [handler](void const* v) {
                handler(*static_cast<T const*>(v));
            };
        }

        // Dispatches t to the handler registered for T, or throws if there is none.
        template <typename T>
        void process(T const& t) {
            auto it = handlers.find(type_id<T>());
            if (it != handlers.end()) {
                it->second(&t);
            } else {
                throw std::runtime_error("handler not registered");
            }
        }

    private:
        std::map<int*, std::function<void (void const*)>> handlers;
    };

This class might be used like so:

    processor p;

    p.register_handler<int>([](int const& i) {
        std::cout << "int: " << i << "\n";
    });
    p.register_handler<float>([](float const& f) {
        std::cout << "float: " << f << "\n";
    });

    try {
        p.process(42);
        p.process(3.14f);
        p.process(true);
    } catch (std::runtime_error& ex) {
        std::cout << "error: " << ex.what() << "\n";
    }
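The question mentions that a bare pointer would not be exposed in practice. A minimal sketch of one way to hide it is shown here; the names `type_token`, `of`, and `check_type_token` are hypothetical and are not taken from the author's actual type_index implementation. The sketch orders tokens with `std::less`, which guarantees a total order over pointers, whereas the result of a raw `<` between pointers to unrelated objects is unspecified.

```cpp
#include <cassert>
#include <functional>
#include <map>

// Hypothetical wrapper around the address-based ID; illustration only.
class type_token {
public:
    // One non-const static object per instantiation; its address is the ID.
    template <typename T>
    static type_token of() {
        static char tag;
        return type_token(&tag);
    }

    friend bool operator==(type_token a, type_token b) { return a.id_ == b.id_; }
    friend bool operator!=(type_token a, type_token b) { return a.id_ != b.id_; }

    // std::less yields a strict total order even for unrelated pointers.
    friend bool operator<(type_token a, type_token b) {
        return std::less<void const*>()(a.id_, b.id_);
    }

private:
    explicit type_token(void const* id) : id_(id) {}
    void const* id_;
};

// Hypothetical helper: uses the token as a map key, as the processor class does.
void check_type_token() {
    std::map<type_token, int> m;
    m[type_token::of<int>()] = 1;
    m[type_token::of<float>()] = 2;
    assert(m.size() == 2);
    assert(m.at(type_token::of<int>()) == 1);
    assert(type_token::of<int>() == type_token::of<int>());
    assert(type_token::of<int>() != type_token::of<float>());
}
```

Note that `std::map<int*, ...>` in the class above already compares keys through `std::less<int*>`, so the raw-pointer version is also well-ordered; the wrapper mainly keeps the pointer out of the public interface.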

Conclusion

Thanks to everyone for your help. I have accepted the answer from @StoryTeller, as he has outlined why the solution should be valid according to the rules of C++. However, @SergeBallesta and a number of others in the comments have pointed out that MSVC performs optimizations which come uncomfortably close to breaking this approach. If a more robust approach is needed, then a solution using std::atomic may be preferable, as suggested by @galinette:

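    #include <atomic>
    #include <cstddef>

    // Brace-initialized: copy-initializing an atomic from 0 is ill-formed before C++17.
    std::atomic_size_t type_id_counter{0};

    template <typename T>
    std::size_t type_id() {
        static std::size_t const x = type_id_counter++;
        return x;
    }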
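A small sketch of how the counter-based variant behaves, assuming the `type_id` template above; `check_counter_ids` is a hypothetical helper added only for illustration. Each type receives its value the first time `type_id<T>()` is called for it and keeps that value afterwards.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Shared counter for all instantiations.
std::atomic<std::size_t> type_id_counter{0};

template <typename T>
std::size_t type_id() {
    // The initializer runs once per instantiation; since C++11 the
    // initialization of a local static is thread-safe.
    static std::size_t const x = type_id_counter++;
    return x;
}

// Hypothetical helper: checks that IDs are distinct and stable.
void check_counter_ids() {
    std::size_t const a = type_id<int>();
    std::size_t const b = type_id<float>();
    assert(a != b);                 // different types, different IDs
    assert(type_id<int>() == a);    // repeated calls return the same ID
    assert(type_id<float>() == b);
}
```

Two caveats follow from this design. The values depend on the order in which types are first queried, so they are not stable between runs and should not be persisted. Also, if the counter is defined in a header included by several translation units, it needs to be declared `inline` (C++17) or declared `extern` and defined in a single source file.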

If anyone has further thoughts or information, I am still eager to hear it!

Joseph Thomson asked Jan 26 '17 06:01



2 Answers

Yes, it will be correct to an extent. Function templates are treated like inline functions for the purposes of the one-definition rule (ODR), and a static object in such a function is shared across all translation units.

So, in every translation unit, you will get the address of the same static local variable for the call to type_id<Type>(). You are protected here from ODR violations by the standard.

Therefore, the address of the local static can be used as a sort of home-brewed run-time type identifier.
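A minimal sketch of the properties this answer relies on, using the `type_id` template from the question; `check_type_ids` is a hypothetical helper name. It can only demonstrate the behaviour within a single translation unit; the sharing across translation units follows from the ODR argument above.

```cpp
#include <cassert>

// type_id as given in the question: one static int per instantiation.
template <typename T>
int* type_id() {
    static int x;
    return &x;
}

// Hypothetical helper: checks same-type equality and cross-type inequality.
void check_type_ids() {
    // Repeated calls for the same type yield the same address.
    assert(type_id<int>() == type_id<int>());
    // Different instantiations own different static objects.
    assert(type_id<int>() != type_id<float>());
    // cv-qualified types are separate instantiations as well.
    assert(type_id<int>() != type_id<int const>());
}
```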

StoryTeller - Unslander Monica answered Oct 09 '22 03:10


This is consistent with the standard, because C++ uses templates rather than generics with type erasure as in Java, so each instantiated type has its own function implementation containing its own static variable. All those variables are different objects and as such should have different addresses.

The problem is that their values are never used and, worse, never changed. I recall that optimizers can merge string constants. Since optimizers do their best to be far cleverer than any human programmer, I fear that an overzealous optimizing compiler might discover that, as those variables are never changed, they will all keep the value 0, so why not merge them all to save memory?

I know that under the as-if rule the compiler is free to do what it wants, provided the observable results are the same. And I am not sure whether the addresses of static variables that always hold the same value are required to be different. Maybe someone could confirm which part of the standard actually governs this?

Current compilers still compile translation units separately, so they cannot be sure whether another translation unit will use or change the value. So my opinion is that the optimizer will not have enough information to decide to merge the variables, and your pattern is safe.

But as I really do not think that the standard protects it, I cannot say whether future versions of C++ toolchains (compiler + linker) will introduce a global optimization phase that actively searches for unchanged variables that could be merged, much as they already actively search for UB to optimize away parts of the code. Only common patterns, where disallowing them would break too large a code base, are protected from this, and I do not think that yours is common enough.

A rather hacky way to prevent an optimization phase from merging variables that have the same value is simply to give each one a different value:

    int unique_val() {
        static int cur = 0;  // normally useless but more readable
        return cur++;
    }

    template <typename T>
    void * type_id() {
        static int x = unique_val();
        return &x;
    }

OK, this does not even try to be thread-safe, but that is not a problem here: the values themselves will never be used. But you now have different variables with static storage duration (per 14.8.2 of the standard, as noted by @StoryTeller) that, except in race conditions, have different values. As they are odr-used, they must have different addresses, and you should be protected against future improvements in optimizing compilers...

Note: I think that as the value will not be used, returning a void * sounds cleaner...
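If the race on the counter is a concern, a thread-safe variant of the same idea could look like the following sketch. It swaps the plain `int` counter for a `std::atomic<int>`, which is a change from the code above, and `check_unique_addresses` is a hypothetical helper used only to show the intended properties.

```cpp
#include <atomic>
#include <cassert>

// Hands out a different value on every call, even from concurrent threads.
inline int unique_val() {
    static std::atomic<int> cur{0};
    return cur++;  // atomic post-increment
}

template <typename T>
void* type_id() {
    // Initialized once per instantiation (thread-safe since C++11),
    // so each instantiation's x holds a distinct value.
    static int x = unique_val();
    return &x;
}

// Hypothetical helper: the address is the ID, the stored value only differs.
void check_unique_addresses() {
    void* a = type_id<int>();
    void* b = type_id<double>();
    assert(a != b);
    assert(*static_cast<int*>(a) != *static_cast<int*>(b));
    assert(type_id<int>() == a);
}
```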


Just an addition, borrowed from a comment by @bogdan: MSVC is known to optimize very aggressively with the /OPT:ICF flag. The discussion suggests that this behaviour is not conformant, and that it only applies to variables marked as const. But it reinforces my opinion that, even if the OP's code seems conformant, I would not dare to use it in production code without additional precautions.

Serge Ballesta answered Oct 09 '22 03:10