
Are Generators Threadsafe?

I have a multithreaded program where I create a generator function and then pass it to new threads. I want it to be shared/global in nature so each thread can get the next value from the generator.

Is it safe to use a generator like this, or will I run into problems (such as race conditions) when accessing the shared generator from multiple threads?

If not, is there a better way to approach the problem? I need something that will cycle through a list and produce the next value for whichever thread calls it.

Corey Goldberg asked Jul 15 '09 13:07




3 Answers

It's not thread-safe; simultaneous calls may interleave, and mess with the local variables.
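
As a rough illustration of what a collision can look like, here is a minimal Python 3 sketch. It assumes CPython, where a second `next()` call on a generator that is already running raises `ValueError` rather than silently corrupting state; other implementations may behave differently. The `numbers` generator and the two `Event` objects are hypothetical helpers that force the overlap deterministically.

```python
import threading

entered = threading.Event()   # set once the worker thread is inside the generator
release = threading.Event()   # lets the generator finish its first step

def numbers():
    # Hypothetical generator that pauses mid-step so a second caller can collide with it.
    entered.set()
    release.wait()
    yield 1
    yield 2

gen = numbers()
results = []

def worker():
    results.append(next(gen))

t = threading.Thread(target=worker)
t.start()

entered.wait()                # the worker is now executing inside the generator
try:
    next(gen)                 # second, concurrent call on the same generator
    error = None
except ValueError as exc:
    error = str(exc)          # CPython reports that the generator is already executing
finally:
    release.set()             # always unblock the worker
    t.join()

print(error)
print(results)
```

In real code the overlap depends on thread scheduling, so the error would appear only intermittently, which is what makes an unprotected shared generator hard to debug.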

The common approach is to use the master-slave pattern (now called farmer-worker pattern in PC). Create a dedicated thread that generates the data, and put a Queue between the master and the slaves: the master writes to the queue and the slaves read from it. The standard queue module (named Queue in Python 2) provides the necessary thread safety and, when the queue is created with a maxsize, blocks the master until the slaves are ready to read more data.
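
The pattern described above could be sketched roughly as follows in Python 3. This is only one way to arrange it: the worker count, the queue size of 8, the `SENTINEL` stop marker, and the `produce`/`consume` names are all assumptions made for the example, not part of the original answer.

```python
import queue
import threading

NUM_WORKERS = 3          # assumed worker count, for illustration
SENTINEL = object()      # marks "no more data" for one worker

def produce(source, q):
    # Master: the only thread that ever touches the generator.
    for item in source:
        q.put(item)              # blocks while the bounded queue is full
    for _ in range(NUM_WORKERS):
        q.put(SENTINEL)          # one stop marker per worker

def consume(q, out, out_lock):
    # Slave: pull items until the sentinel arrives.
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        with out_lock:
            out.append(item)

q = queue.Queue(maxsize=8)       # bounded, so the master cannot run far ahead
collected = []
collected_lock = threading.Lock()
source = (x * 2 for x in range(100))

workers = [threading.Thread(target=consume, args=(q, collected, collected_lock))
           for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()
producer = threading.Thread(target=produce, args=(source, q))
producer.start()
producer.join()
for w in workers:
    w.join()

print(sorted(collected) == [x * 2 for x in range(100)])
```

Because only the producer thread ever calls into the generator, the generator itself needs no lock; all cross-thread traffic goes through the queue.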

Martin v. Löwis answered Oct 17 '22 04:10



Edited to add benchmark below.

You can wrap a generator with a lock. For example,

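import threading

class LockedIterator(object):
    """Wrap an iterator so only one thread at a time can advance it."""
    def __init__(self, it):
        self.lock = threading.Lock()
        self.it = iter(it)

    def __iter__(self):
        return self

    def __next__(self):
        # Hold the lock while advancing; `with` releases it even on StopIteration.
        with self.lock:
            return next(self.it)

    next = __next__  # Python 2 spelling of the same method

gen = (x*2 for x in [1,2,3,4])  # a generator expression rather than a list
g2 = LockedIterator(gen)
print(list(g2))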
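
For the case in the question (several threads drawing from one cycling list), a usage sketch might look like the following. It restates the wrapper in Python 3 syntax so the block runs on its own; the `servers` list, the worker count, and the use of `itertools.cycle` are assumptions chosen for illustration.

```python
import itertools
import threading

class LockedIterator:
    """Python 3 restatement of the lock-wrapped iterator shown above."""
    def __init__(self, it):
        self.lock = threading.Lock()
        self.it = iter(it)

    def __iter__(self):
        return self

    def __next__(self):
        with self.lock:
            return next(self.it)

servers = ["a", "b", "c"]                       # hypothetical list to cycle through
shared = LockedIterator(itertools.cycle(servers))

taken = []
taken_lock = threading.Lock()

def worker(n):
    # Each thread pulls n values from the one shared iterator.
    for _ in range(n):
        value = next(shared)
        with taken_lock:
            taken.append(value)

threads = [threading.Thread(target=worker, args=(30,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 4 threads x 30 draws = 120 values, so each of the 3 entries is handed out 40 times.
counts = {s: taken.count(s) for s in servers}
print(counts)
```

The lock guarantees that every value is handed out exactly once and in cycle order; which thread receives which value still depends on scheduling.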

In the benchmark below (4 worker threads, just under 10,000 items), the locked iterator takes about 50 ms on my system and the Queue version about 350 ms. Queue is useful when you really do have a queue; for example, if you have incoming HTTP requests and you want to queue them for processing by worker threads. (That doesn't fit the Python iterator model: once an iterator runs out of items, it's done.) If you really do have an iterator, then LockedIterator is a faster and simpler way to make it thread-safe.

from datetime import datetime
import threading

num_worker_threads = 4

class LockedIterator(object):
    def __init__(self, it):
        self.lock = threading.Lock()
        self.it = iter(it)

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying iterator.
        with self.lock:
            return next(self.it)

    next = __next__  # Python 2 spelling of the same method

def test_locked(it):
    it = LockedIterator(it)
    def worker():
        try:
            for i in it:
                pass
        except Exception as e:
            print(e)
            raise

    threads = []
    for i in range(num_worker_threads):
        t = threading.Thread(target=worker)
        threads.append(t)
        t.start()

    for t in threads:
        t.join()

def test_queue(it):
    from queue import Queue  # the module is named Queue in Python 2
    def worker():
        try:
            while True:
                item = q.get()
                q.task_done()
        except Exception as e:
            print(e)
            raise

    q = Queue()
    # Daemon threads: the workers block forever on q.get() and die with the process.
    for i in range(num_worker_threads):
        t = threading.Thread(target=worker)
        t.daemon = True
        t.start()

    for item in it:
        q.put(item)

    # Wait until every item has been marked done.
    q.join()

start_time = datetime.now()
it = [x*2 for x in range(1, 10000)]

test_locked(it)
#test_queue(it)
end_time = datetime.now()
took = end_time - start_time
print("took %.01f ms" % (took.total_seconds() * 1000))
Glenn Maynard answered Oct 17 '22 03:10



No, they are not thread-safe. You can find interesting info about generators and multi-threading in:

http://www.dabeaz.com/generators/Generators.pdf

Mikhail Churbanov answered Oct 17 '22 03:10
