The "Go maps in action" entry in the Go blog states: <blockquote> Maps are not safe for concurrent use: it's not defined what happens when you read and write to them simultaneously. If you need to read from and write to a map from concurrently executing goroutines, the accesses must be mediated by some kind of synchronization mechanism. One common way to protect maps is with sync.RWMutex. </blockquote> However, one common way to access maps is to iterate over them with the <code>range</code> keyword. It is not clear if for the purposes of concurrent access, execution inside a <code>range</code> loop is a "read", or just the "turnover" phase of that loop. For example, the following code may or may not run afoul of the "no concurrent r/w on maps" rule, depending on the specific semantics / implementation of the <code>range</code> operation: <pre class="prettyprint"><code> var testMap map[int]int testMapLock := make(chan bool, 1) testMapLock <- true testMapSequence := 0 </code></pre> ... <pre class="prettyprint"><code> func WriteTestMap(k, v int) { <-testMapLock testMap[k] = v testMapSequence++ testMapLock<-true } func IterateMapKeys(iteratorChannel chan int) error { <-testMapLock defer func() { testMapLock <- true } mySeq := testMapSequence for k, _ := range testMap { testMapLock <- true iteratorChannel <- k <-testMapLock if mySeq != testMapSequence { close(iteratorChannel) return errors.New("concurrent modification") } } return nil } </code></pre> The idea here is that the <code>range</code> "iterator" is open when the second function is waiting for a consumer to take the next value, and the writer is not blocked at that time. However, it is never the case that two reads in a single iterator are on either side of a write - this is a "fail fast" iterator, the borrow a Java term. Is there anything anywhere in the language specification or other documents that indicates if this is a legitimate thing to do, however? I could see it going either way, and the above quoted document is not clear on exactly what consititutes a "read". The documentation seems totally quiet on the concurrency aspects of the <code>for</code>/<code>range</code> statement. (Please note this question is about the currency of <code>for/range</code>, but not a duplicate of: Golang concurrent map access with range - the use case is completely different and I am asking about the precise locking requirement wrt the 'range' keyword here!)

The unit of concurrent access for a <code>for</code> <code>range</code> loop over a <code>map</code> is the map. Go maps in action. A map is a dynamic data structure that changes for inserts, updates and deletes. Inside the Map Implementation. For example, <blockquote> The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0. <a href="https://golang.org/ref/spec#For_statements" rel="nofollow noreferrer">For statements, The Go Programming Language Specification</a> </blockquote> Reading a map with a <code>for</code> <code>range</code> loop with interleaved inserts, updates and deletes is unlikely to be useful. Lock the map: <pre class="prettyprint"><code>package main import ( "sync" ) var racer map[int]int var race sync.RWMutex func Reader() { race.RLock() // Lock map for k, v := range racer { _, _ = k, v } race.RUnlock() } func Write() { for i := 0; i < 1e6; i++ { race.Lock() racer[i/2] = i race.Unlock() } } func main() { racer = make(map[int]int) Write() go Write() Reader() } </code></pre> Don't lock after the read -- <code>fatal error: concurrent map iteration and map write</code>: <pre class="prettyprint"><code>package main import ( "sync" ) var racer map[int]int var race sync.RWMutex func Reader() { for k, v := range racer { race.RLock() // Lock after read _, _ = k, v race.RUnlock() } } func Write() { for i := 0; i < 1e6; i++ { race.Lock() racer[i/2] = i race.Unlock() } } func main() { racer = make(map[int]int) Write() go Write() Reader() } </code></pre> Use the Go Data Race Detector. Read Introducing the Go Race Detector.

Concurrent access to maps with 'range' in Go

Tags:

dictionary

foreach

for-loop

concurrency

go

The "Go maps in action" entry in the Go blog states:

Maps are not safe for concurrent use: it's not defined what happens when you read and write to them simultaneously. If you need to read from and write to a map from concurrently executing goroutines, the accesses must be mediated by some kind of synchronization mechanism. One common way to protect maps is with sync.RWMutex.

However, one common way to access maps is to iterate over them with the range keyword. It is not clear if for the purposes of concurrent access, execution inside a range loop is a "read", or just the "turnover" phase of that loop. For example, the following code may or may not run afoul of the "no concurrent r/w on maps" rule, depending on the specific semantics / implementation of the range operation:

 var testMap map[int]int
 testMapLock := make(chan bool, 1)
 testMapLock <- true
 testMapSequence := 0

...

 func WriteTestMap(k, v int) {
    <-testMapLock
    testMap[k] = v
    testMapSequence++
    testMapLock<-true
 }

 func IterateMapKeys(iteratorChannel chan int) error {
    <-testMapLock
    defer func() { 
       testMapLock <- true
    }
    mySeq := testMapSequence
    for k, _ := range testMap {
       testMapLock <- true
       iteratorChannel <- k
       <-testMapLock
       if mySeq != testMapSequence {
           close(iteratorChannel)
           return errors.New("concurrent modification")
       }
    }
    return nil
 }

The idea here is that the range "iterator" is open when the second function is waiting for a consumer to take the next value, and the writer is not blocked at that time. However, it is never the case that two reads in a single iterator are on either side of a write - this is a "fail fast" iterator, the borrow a Java term.

Is there anything anywhere in the language specification or other documents that indicates if this is a legitimate thing to do, however? I could see it going either way, and the above quoted document is not clear on exactly what consititutes a "read". The documentation seems totally quiet on the concurrency aspects of the for/range statement.

(Please note this question is about the currency of for/range, but not a duplicate of: Golang concurrent map access with range - the use case is completely different and I am asking about the precise locking requirement wrt the 'range' keyword here!)

249

asked Nov 05 '16 20:11

BadZen

2 Answers

You are using a for statement with a range expression. Quoting from Spec: For statements:

The range expression is evaluated once before beginning the loop, with one exception: if the range expression is an array or a pointer to an array and at most one iteration variable is present, only the range expression's length is evaluated; if that length is constant, by definition the range expression itself will not be evaluated.

We're ranging over a map, so it's not an exception: the range expression is evaluated only once before beginning the loop. The range expression is simply a map variable testMap:

for k, _ := range testMap {}

The map value does not include the key-value pairs, it only points to a data structure that does. Why is this important? Because the map value is only evaluated once, and if later pairs are added to the map, the map value –evaluated once before the loop– will be a map that still points to a data structure that includes those new pairs. This is in contrast to ranging over a slice (which would be evaluated once too), which is also only a header pointing to a backing array holding the elements; but if elements are added to the slice during the iteration, even if that does not result in allocating and copying over to a new backing array, they will not be included in the iteration (because the slice header also contains the length - already evaluated). Appending elements to a slice may result in a new slice value, but adding pairs to a map will not result in a new map value.

Now on to iteration:

for k, v := range testMap {
    t1 := time.Now()
    someFunction()
    t2 := time.Now()
}

Before we enter into the block, before the t1 := time.Now() line k and v variables are holding the values of the iteration, they are already read out from the map (else they couldn't hold the values). Question: do you think the map is read by the for ... range statement between t1 and t2? Under what circumstances could that happen? We have here a single goroutine that is executing someFunc(). To be able to access the map by the for statement, that would either require another goroutine, or it would require to suspend someFunc(). Obviously neither of those happen. (The for ... range construct is not a multi-goroutine monster.) No matter how many iterations there are, while someFunc() is executed, the map is not accessed by the for statement.

So to answer one of your questions: the map is not accessed inside the for block when executing an iteration, but it is accessed when the k and v values are set (assigned) for the next iteration. This implies that the following iteration over the map is safe for concurrent access:

var (
    testMap         = make(map[int]int)
    testMapLock     = &sync.RWMutex{}
)

func IterateMapKeys(iteratorChannel chan int) error {
    testMapLock.RLock()
    defer testMapLock.RUnlock()
    for k, v := range testMap {
        testMapLock.RUnlock()
        someFunc()
        testMapLock.RLock()
        if someCond {
            return someErr
        }
    }
    return nil
}

Note that unlocking in IterateMapKeys() should (must) happen as a deferred statement, as in your original code you may return "early" with an error, in which case you didn't unlock, which means the map remained locked! (Here modeled by if someCond {...}).

Also note that this type of locking only ensures locking in case of concurrent access. It does not prevent a concurrent goroutine to modify (e.g. add a new pair) the map. The modification (if properly guarded with write lock) will be safe, and the loop may continue, but there is no guarantee that the for loop will iterate over the new pair:

If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next.

The write-lock-guarded modification may look like this:

func WriteTestMap(k, v int) {
    testMapLock.Lock()
    defer testMapLock.Unlock()
    testMap[k] = v
}

Now if you release the read lock in the block of the for, a concurrent goroutine is free to grab the write lock and make modifications to the map. In your code:

testMapLock <- true
iteratorChannel <- k
<-testMapLock

When sending k on the iteratorChannel, a concurrent goroutine may modify the map. This is not just an "unlucky" scenario, sending a value on a channel is often a "blocking" operation, if the channel's buffer is full, another goroutine must be ready to receive in order for the send operation to proceed. Sending a value on a channel is a good scheduling point for the runtime to run other goroutines even on the same OS thread, not to mention if there are multiple OS threads, of which one may already be "waiting" for the write lock in order to carry out a map modification.

To sum the last part: you releasing the read lock inside the for block is like yelling to others: "Come, modify the map now if you dare!" Consequently in your code encountering that mySeq != testMapSequence is very likely. See this runnable example to demonstrate it (it's a variation of your example):

package main

import (
    "fmt"
    "math/rand"
    "sync"
)

var (
    testMap         = make(map[int]int)
    testMapLock     = &sync.RWMutex{}
    testMapSequence int
)

func main() {
    go func() {
        for {
            k := rand.Intn(10000)
            WriteTestMap(k, 1)
        }
    }()

    ic := make(chan int)
    go func() {
        for _ = range ic {
        }
    }()

    for {
        if err := IterateMapKeys(ic); err != nil {
            fmt.Println(err)
        }
    }
}

func WriteTestMap(k, v int) {
    testMapLock.Lock()
    defer testMapLock.Unlock()
    testMap[k] = v
    testMapSequence++
}

func IterateMapKeys(iteratorChannel chan int) error {
    testMapLock.RLock()
    defer testMapLock.RUnlock()
    mySeq := testMapSequence
    for k, _ := range testMap {
        testMapLock.RUnlock()
        iteratorChannel <- k
        testMapLock.RLock()
        if mySeq != testMapSequence {
            //close(iteratorChannel)
            return fmt.Errorf("concurrent modification %d", testMapSequence)
        }
    }
    return nil
}

Example output:

concurrent modification 24
concurrent modification 41
concurrent modification 463
concurrent modification 477
concurrent modification 482
concurrent modification 496
concurrent modification 508
concurrent modification 521
concurrent modification 525
concurrent modification 535
concurrent modification 541
concurrent modification 555
concurrent modification 561
concurrent modification 565
concurrent modification 570
concurrent modification 577
concurrent modification 591
concurrent modification 593

We're encountering concurrent modification quite often!

Do you want to avoid this kind of concurrent modification? The solution is quite simple: don't release the read lock inside the for. Also run your app with the -race option to detect race conditions: go run -race testmap.go

Final thoughts

The language spec clearly allows you to modify the map in the same goroutine while ranging over it, this is what the previous quote relates to ("If map entries that have not yet been reached are removed during iteration.... If map entries are created during iteration..."). Modifying the map in the same goroutine is allowed and is safe, but how it is handled by the iterator logic is not defined.

If the map is modified in another goroutine, if you use proper synchronization, The Go Memory Model guarantees that the goroutine with the for ... range will observe all modifications, and the iterator logic will see it just as if "its own" goroutine would have modified it – which is allowed as stated before.

185

answered Nov 10 '22 08:11

icza

The unit of concurrent access for a for range loop over a map is the map. Go maps in action.

A map is a dynamic data structure that changes for inserts, updates and deletes. Inside the Map Implementation. For example,

The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0. For statements, The Go Programming Language Specification

Reading a map with a for range loop with interleaved inserts, updates and deletes is unlikely to be useful.

Lock the map:

package main

import (
    "sync"
)

var racer map[int]int

var race sync.RWMutex

func Reader() {
    race.RLock() // Lock map
    for k, v := range racer {
        _, _ = k, v
    }
    race.RUnlock()
}

func Write() {
    for i := 0; i < 1e6; i++ {
        race.Lock()
        racer[i/2] = i
        race.Unlock()
    }
}

func main() {
    racer = make(map[int]int)
    Write()
    go Write()
    Reader()
}

Don't lock after the read -- fatal error: concurrent map iteration and map write:

package main

import (
    "sync"
)

var racer map[int]int

var race sync.RWMutex

func Reader() {
    for k, v := range racer {
        race.RLock() // Lock after read
        _, _ = k, v
        race.RUnlock()
    }
}

func Write() {
    for i := 0; i < 1e6; i++ {
        race.Lock()
        racer[i/2] = i
        race.Unlock()
    }
}

func main() {
    racer = make(map[int]int)
    Write()
    go Write()
    Reader()
}

Use the Go Data Race Detector. Read Introducing the Go Race Detector.

answered Nov 10 '22 08:11

peterSO

Related questions
                            
                                int, short, byte performance in back-to-back for-loops
                            
                                How do I convert a doubly recursive method to a loop?
                            
                                Why are structured bindings in range-based for-loop just a copy and not a reference?
                            
                                Why does list(next(iter(())) for _ in range(1)) == []?
                            
                                Redo for loop iteration in Python
                            
                                Understanding nested for loops in javascript
                            
                                Does C++11 for loop allow new or better optimizations?
                            
                                Redirect output of command in for loop of batch script
                            
                                How can I use a list comprehension to extend a list in python? [duplicate]
                            
                                Creating N nested for-loops
                            
                                Iterate through multiple collections in the same "for" loop?
                            
                                why does filter have to be defined for pattern matching in a for loop in scala?
                            
                                Pythonic shortcut for doubly nested for loops?
                            
                                How do I rename a data-frame in a for-loop
                            
                                For loop goes out of range
                            
                                Same function over multiple data frames in R
                            
                                How to Write the Range-based For-Loop With Argv?
                            
                                Using a loop in Python to name variables
                            
                                Break in nested for loops [duplicate]
                            
                                How to write a for loop that will pick up a count where it left off?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With