I know that python lists are thread safe. However, let's suppose I have a shared list
object_list = []
Now, a thread periodically adds an object (a data structure) to this list,
object_list.add(aNewObject)
Another thread, when invoked, accesses this list in a for loop
for i in object_list:
#do something with i
Do I need something like this:
myLock.acquire()
either add an element or access the list with the for
myLock.release()
Or I am fine anyway? I don't care about missing an add, but I am worried about what happens if a new element is added when the reading thread is iterating in the middle of the list...
It depends on what guarantees you expect. First of all, the python language does not guarantee any thread-safety.
The CPython implementation incidentally offer a bit of thread-safety due to its GIL, but if you ever plan to run the code under any other python implementation you should just use locks all over the place whenever you access the resource.
Now, for a CPython specific consideration, if you know that the code will add elements only via append
or extend
, then the worst that can happen is that the for
loop will iterate also over the new elements. This might or might not be something you want1:
>>> a = [0]
>>> for i, el in enumerate(a):
... if i % 2 == 0:
... a.append(i)
... print(i, el)
...
0 0
1 0
>>> a
[0, 0]
>>> for i, el in enumerate(a):
... if i % 2 == 0:
... a.append(i)
... print(i, el)
...
0 0
1 0
2 0
3 2
>>> a
[0, 0, 0, 2]
If new items are added using the insert
method then it's possible that some values are iterated over more then once, while some others aren't iterated over during the for
. For example:
>>> a = [1]
>>> for i,el in enumerate(a):
... if i % 2 == 0:
... a.insert(0, i)
... print(i, el)
...
(0, 1)
(1, 1)
>>> a
[0, 1]
Here the 1
is iterated over twice and the newly inserted 0
is never seen in the loop
.
Also removing elements from a list can result in similar behaviour where some elements are skipped:
>>> for i, el in enumerate(a):
... if i % 2 == 0:
... a.remove(i)
... print(i, el)
...
(0, 0)
>>> a
[1]
As you can see the 1
is never seen during the loop.
This said mutating a sequence during iteration is usually something you don't want. In your case I'd recommend using a lock. Note that locks are context managers so I'd write the loop as:
with list_lock:
for element in object_list:
# do stuff
This simplifies the code since you don't have to remember to release the lock explicitly. The same should be done when inserting the element (otherwise it's useless):
with list_lock:
object_list.append(new_element)
1 I don't use separate threads to simplify the example and make it easily reproducible. In CPython having the operations done in multiple threads shouldn't change the behaviour you see. This is not true in other implementations. Obviously with multiple threads is much harder to reproduce the behaviour.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With