I have a generator that will produce more than 1 trillion strings. I would like to put them in a queue and let a pool of workers consume it. However, I can't afford to hold all 1 trillion strings in memory and map them to threads.
The generator is very fast; the consumer workers are not. I need to keep the queue length bounded so that I don't run out of memory, which means I need a way to pause and resume feeding the queue.
Could anyone give me a hint on how to accomplish this in Python 3.4?
You can specify the maximum size of the queue:
import queue
q = queue.Queue(maxsize=10)  # the queue holds at most 10 items
When the queue reaches its maximum size, new insertions block until items are removed from the queue.
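To make the blocking behaviour concrete, here is a minimal sketch using a deliberately tiny queue. The bound of 2 and the item values are only illustrative. It uses put_nowait(), which raises queue.Full instead of blocking, so the limit can be observed without hanging the script.

```python
import queue

q = queue.Queue(maxsize=2)  # tiny bound, chosen only so the limit is easy to hit
q.put("a")
q.put("b")

hit_limit = False
try:
    # put_nowait() raises queue.Full where a plain put() would block
    q.put_nowait("c")
except queue.Full:
    hit_limit = True
    print("queue is full; a plain put() would block here")

first = q.get()  # removing one item frees a slot
q.put("c")       # succeeds immediately now that there is room
```

A plain q.put("c") at the point where the exception is raised would simply wait until some other thread called q.get().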
Your generator thread can just generate items and put them on the queue. If it gets too far ahead of the consumer threads it will just block.
for e in generator:  # generator is whatever iterable produces your strings
    q.put(e)  # blocks while the queue is full
See https://docs.python.org/3/library/queue.html for more info.
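Putting the pieces together, a rough sketch of the whole arrangement might look like the following. The names generate_strings, worker, NUM_WORKERS and QUEUE_SIZE, the upper() "work", and the sentinel-based shutdown are all assumptions made for illustration, not part of the question. The results list exists only so the demo can be checked; with a trillion items you would write results out rather than keep them in memory.

```python
import queue
import threading

NUM_WORKERS = 4    # hypothetical pool size
QUEUE_SIZE = 100   # hypothetical bound; caps how far the producer can run ahead
SENTINEL = None    # marker telling a worker to stop


def generate_strings(n):
    """Stand-in for the real (fast) generator."""
    for i in range(n):
        yield "item-%d" % i


def worker(q, results, lock):
    """Consume items until the sentinel arrives."""
    while True:
        item = q.get()  # blocks while the queue is empty
        if item is SENTINEL:
            break
        processed = item.upper()  # placeholder for the slow real work
        with lock:
            results.append(processed)  # demo only; do not collect 1e12 items


def run(n_items):
    q = queue.Queue(maxsize=QUEUE_SIZE)
    results = []
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(q, results, lock))
               for _ in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for s in generate_strings(n_items):
        q.put(s)  # blocks whenever QUEUE_SIZE items are already waiting
    for _ in threads:
        q.put(SENTINEL)  # one sentinel per worker
    for t in threads:
        t.join()
    return results


results = run(1000)
print(len(results))
```

The producer never needs explicit pause or resume logic: the blocking put() throttles it to the speed of the workers, and at most QUEUE_SIZE strings are queued at any moment.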