Is the following singleton implementation data-race free?
static std::atomic<Tp *> m_instance;
...
static Tp & instance()
{
    if (!m_instance.load(std::memory_order_relaxed))
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (!m_instance.load(std::memory_order_acquire))
        {
            Tp * i = new Tp;
            m_instance.store(i, std::memory_order_release);
        }
    }
    return * m_instance.load(std::memory_order_relaxed);
}
Is the std::memory_order_acquire of the inner load operation superfluous? Is it possible to further relax both the load and the store by switching them to std::memory_order_relaxed? In that case, are the acquire/release semantics of std::mutex enough to guarantee correctness, or is a further std::atomic_thread_fence(std::memory_order_release) also required to ensure that the constructor's writes to memory happen before the relaxed store? And is using a fence equivalent to performing the store with memory_order_release?
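For concreteness, the fence-based variant asked about here could be sketched as follows. This is only an illustration under assumptions: Config and ConfigHolder are hypothetical names that do not appear in the original code, and the instance is deliberately never deleted. The release fence before the relaxed store pairs with the acquire fence after the relaxed load on the fast path.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical payload type, used only for illustration.
struct Config {
    int value = 42;
};

// Hypothetical holder class; the names are assumptions, not from the post.
class ConfigHolder {
public:
    static Config& instance() {
        // Fast path: relaxed load followed by an acquire fence.
        Config* p = m_instance.load(std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_acquire);
        if (!p) {
            std::lock_guard<std::mutex> lock(m_mutex);
            // The mutex already orders this load against the previous
            // lock holder, so relaxed is sufficient here.
            p = m_instance.load(std::memory_order_relaxed);
            if (!p) {
                p = new Config;  // intentionally never deleted
                // Release fence: the constructor's writes are ordered
                // before the relaxed store that publishes the pointer.
                std::atomic_thread_fence(std::memory_order_release);
                m_instance.store(p, std::memory_order_relaxed);
            }
        }
        return *p;
    }

private:
    static std::atomic<Config*> m_instance;
    static std::mutex m_mutex;
};

std::atomic<Config*> ConfigHolder::m_instance{nullptr};
std::mutex ConfigHolder::m_mutex;
```

Note that a release fence is not identical to a release store: the fence orders earlier writes before every later atomic store made by the thread, not only this one, so it is at least as strong as the release store it replaces.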
EDIT: Thanks to John's answer, I came up with the following implementation, which should be data-race free. Even though the inner load need not be atomic at all, I decided to keep a relaxed load since it does not affect performance. Compared to always performing the outer load with acquire memory order, the thread_local machinery speeds up access to the instance by about an order of magnitude.
static Tp & instance()
{
    static thread_local Tp *instance;

    if (!instance &&
        !(instance = m_instance.load(std::memory_order_acquire)))
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (!(instance = m_instance.load(std::memory_order_relaxed)))
        {
            instance = new Tp;
            m_instance.store(instance, std::memory_order_release);
        }
    }
    return *instance;
}
Double-checked locking of a singleton is a way to make sure that only one instance of the singleton class is created throughout an application's life cycle.
The double check is only necessary if you are worried about many threads calling the singleton simultaneously, or about the cost of obtaining a lock in general. Its purpose is to avoid unnecessary synchronization, thereby keeping your code fast in a multi-threaded environment.
Double-Checked Locking is widely cited and used as an efficient method for implementing lazy initialization in a multithreaded environment. Unfortunately, it will not work reliably in a platform independent way when implemented in Java, without additional synchronization.
Double-checked locking is a common pattern for lazy initialization of a field accessed by multiple threads.
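As a minimal sketch of the pattern in C++11, assuming an acquire load on the fast path and a release store when publishing the pointer: LazySingleton and Counter are hypothetical names introduced for illustration, and the instance is deliberately never deleted.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical wrapper; the name LazySingleton is an assumption.
template <typename Tp>
class LazySingleton {
public:
    static Tp& instance() {
        // First check, without the lock. The acquire load pairs with the
        // release store below, so a non-null pointer implies that the
        // constructor's writes are visible to this thread.
        Tp* p = m_instance.load(std::memory_order_acquire);
        if (!p) {
            std::lock_guard<std::mutex> lock(m_mutex);
            // Second check, under the lock. The mutex orders this load
            // against any earlier lock holder, so relaxed is enough.
            p = m_instance.load(std::memory_order_relaxed);
            if (!p) {
                p = new Tp;  // intentionally never deleted
                m_instance.store(p, std::memory_order_release);
            }
        }
        return *p;
    }

private:
    static std::atomic<Tp*> m_instance;
    static std::mutex m_mutex;
};

template <typename Tp>
std::atomic<Tp*> LazySingleton<Tp>::m_instance{nullptr};

template <typename Tp>
std::mutex LazySingleton<Tp>::m_mutex;

// Hypothetical payload type for the usage example.
struct Counter {
    int hits = 0;
};
```

A call such as LazySingleton<Counter>::instance().hits += 1 takes the lock only until the pointer has been published; later calls return after the single acquire load.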
I think this is a great question, and John Calsbeek has the correct answer.
However, just to be clear, a lazy singleton is best implemented using the classic Meyers singleton. It has guaranteed correct semantics in C++11.
§ 6.7.4
... If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization. ...
The Meyers singleton is preferred in that the compiler can aggressively optimize the concurrent code. The compiler would be more restricted if it had to preserve the semantics of a std::mutex. Furthermore, the Meyers singleton is 2 lines and virtually impossible to get wrong.
Here is a classic example of a Meyers singleton. Simple, elegant, and broken in C++03. But simple, elegant, and powerful in C++11.
class Foo
{
public:
    static Foo& instance( void )
    {
        static Foo s_instance;
        return s_instance;
    }
};
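To illustrate the guarantee quoted above, here is a small sketch that calls instance() from several threads and checks that every thread sees the same object. The construction counter and the distinct_instances helper are hypothetical additions for this demonstration, not part of the original class.

```cpp
#include <atomic>
#include <cstddef>
#include <set>
#include <thread>
#include <vector>

class Foo {
public:
    static Foo& instance() {
        static Foo s_instance;  // initialized exactly once since C++11
        return s_instance;
    }

    // Hypothetical counter, added only to observe how often Foo is built.
    static int constructions() { return s_constructions.load(); }

    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;

private:
    Foo() { ++s_constructions; }
    static std::atomic<int> s_constructions;
};

std::atomic<int> Foo::s_constructions{0};

// Hypothetical helper: calls Foo::instance() from n_threads threads and
// returns the number of distinct addresses they observed.
std::size_t distinct_instances(std::size_t n_threads) {
    std::vector<Foo*> seen(n_threads);  // one slot per thread, no sharing
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < n_threads; ++i) {
        workers.emplace_back([&seen, i] { seen[i] = &Foo::instance(); });
    }
    for (auto& t : workers) {
        t.join();
    }
    return std::set<Foo*>(seen.begin(), seen.end()).size();
}
```

Calling distinct_instances(8) should report a single address, and Foo::constructions() should stay at 1 however many threads race on the first call.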