__thread Foo foo;
How is "foo" actually resolved? Does the compiler silently replace every instance of "foo" with a function call? Is "foo" stored somewhere relative to the bottom of the stack, and the compiler stores this as "hey, for each thread, have this space near the bottom of the stack, and foo is stored as 'offset x from bottom of stack'"?
With thread local storage (TLS), you can provide unique data for each thread that the process can access using a global index. One thread allocates the index, which can be used by the other threads to retrieve the unique data associated with the index.
The WSD block is created and initialised before the executable's code begins to run. C++0x introduces the thread_local keyword which ensures that a separate instance of the global variable is created per thread. The problem is that a different block needs to be loaded per thread.
TLS is always going to be slow relative to simple access. Accessing TLS globals in a tight loop is going to be slow, too.
We need thread-local storage to create libraries that have thread-safe functions, because of the thread-local storage each call to a function has its copy of the same global data, so it's safe I like to point out that the implementation is the same for copy on write technique.
It's a little complicated (this document explains it in great detail), but it's basically neither. Instead the compiler puts a special .tdata section in the executable, which contains all the thread-local variables. At runtime, a new data section for each thread is created with a copy of the data in the (read-only) .tdata section, and when threads are switched at runtime, the section is also switched automatically.
The end result is that __thread variables are just as fast as regular variables, and they don't take up extra stack space, either.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With