While reading the Go source code, I came across a question about this code in src/sync/once.go:
func (o *Once) Do(f func()) {
	// Note: Here is an incorrect implementation of Do:
	//
	//	if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
	//		f()
	//	}
	//
	// Do guarantees that when it returns, f has finished.
	// This implementation would not implement that guarantee:
	// given two simultaneous calls, the winner of the cas would
	// call f, and the second would return immediately, without
	// waiting for the first's call to f to complete.
	// This is why the slow path falls back to a mutex, and why
	// the atomic.StoreUint32 must be delayed until after f returns.
	if atomic.LoadUint32(&o.done) == 0 {
		// Outlined slow-path to allow inlining of the fast-path.
		o.doSlow(f)
	}
}

func (o *Once) doSlow(f func()) {
	o.m.Lock()
	defer o.m.Unlock()
	if o.done == 0 {
		defer atomic.StoreUint32(&o.done, 1)
		f()
	}
}
Why is atomic.StoreUint32 used, rather than, say, o.done = 1? Are these not equivalent? What are the differences?
Must we use the atomic operation (atomic.StoreUint32) to make sure that other goroutines can observe the effect of f() before o.done is set to 1 on a machine with a weak memory model?
Remember, unless you are writing the assembly by hand, you are not programming to your machine's memory model, you are programming to Go's memory model. This means that even if primitive assignments are atomic on your architecture, Go requires the use of the atomic package to ensure correctness across all supported architectures.
Access to the done flag outside of the mutex only needs to be safe, not strictly ordered, so atomic operations can be used instead of always obtaining a lock with a mutex. This is an optimization to make the fast path as efficient as possible, allowing sync.Once to be used in hot paths.
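As a rough illustration of the hot-path use case, the sketch below calls Do on every lookup. The registry type, its fields, and the table contents are assumptions made up for this example. After the first call, each Do is expected to cost little more than the atomic load on the fast path.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical lazily initialized lookup table.
type registry struct {
	once  sync.Once
	table map[string]int
	loads int // counts how many times the initializer ran
}

// lookup initializes the table on first use and then reads from it.
// Do is called on every lookup, which is fine because the fast path
// is a single atomic load once initialization has completed.
func (r *registry) lookup(key string) int {
	r.once.Do(func() {
		r.loads++
		r.table = map[string]int{"a": 1, "b": 2}
	})
	return r.table[key]
}

func main() {
	var r registry
	sum := 0
	for i := 0; i < 1000; i++ {
		sum += r.lookup("b")
	}
	fmt.Println(sum, r.loads) // prints "2000 1"
}
```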
The mutex used for doSlow is for mutual exclusion within that function alone, to ensure that only one caller ever makes it to f() before the done flag is set. The flag is written using atomic.StoreUint32, because it may happen concurrently with atomic.LoadUint32 outside of the critical section protected by the mutex.
Accessing the done field with a plain read or write concurrently with another write, even an atomic one, is a data race. Just because the field is read atomically does not mean you can use normal assignment to write it, hence the flag is checked first with atomic.LoadUint32 and written with atomic.StoreUint32.
The direct read of done within doSlow is safe, because it is protected from concurrent writes by the mutex. Reading the value concurrently with atomic.LoadUint32 is safe because both are read operations.
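Putting the pieces together, here is a simplified re-creation of the pattern for experimentation. The lazyOnce type and the runConcurrently helper are hypothetical names for this sketch; the two methods mirror the code quoted in the question rather than the current standard library source. Every caller should see the result of f after Do returns, whether it took the fast path (atomic load) or the slow path (mutex).

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// lazyOnce is a hypothetical, simplified copy of the pattern in the question.
type lazyOnce struct {
	done uint32
	m    sync.Mutex
}

// Do takes the fast path with an atomic load and falls back to the mutex.
func (o *lazyOnce) Do(f func()) {
	if atomic.LoadUint32(&o.done) == 0 {
		o.doSlow(f)
	}
}

// doSlow reads done directly because the mutex excludes concurrent writers,
// and stores it atomically because fast-path readers do not hold the mutex.
func (o *lazyOnce) doSlow(f func()) {
	o.m.Lock()
	defer o.m.Unlock()
	if o.done == 0 {
		defer atomic.StoreUint32(&o.done, 1)
		f()
	}
}

// runConcurrently calls Do from n goroutines. It returns how many times f
// ran and how many goroutines saw f's effect after Do returned.
func runConcurrently(n int) (calls int, sawInit int) {
	var o lazyOnce
	var config string // written only inside f
	var observed int32
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			o.Do(func() {
				calls++
				config = "ready"
			})
			if config == "ready" {
				atomic.AddInt32(&observed, 1)
			}
		}()
	}
	wg.Wait()
	return calls, int(observed)
}

func main() {
	calls, saw := runConcurrently(100)
	fmt.Println(calls, saw) // prints "1 100"
}
```

Changing the deferred store to a plain o.done = 1 leaves the mutex-protected read in doSlow valid, but the unsynchronized atomic load in Do would then race with that write.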