Python - What is queue.task_done() used for?

I wrote a script that has multiple threads (created with threading.Thread) fetching URLs from a Queue using queue.get_nowait(), and then processing the HTML. I am new to multi-threaded programming, and am having trouble understanding the purpose of the queue.task_done() function.

When the Queue is empty, get_nowait() automatically raises the queue.Empty exception. So I don't understand the need for each thread to call the task_done() function. We know that we're done with the queue when it's empty, so why do we need to notify it that the worker threads have finished their work (which has nothing to do with the queue, after they've gotten the URL from it)?

Could someone provide me with a code example (ideally using urllib, file I/O, or something other than fibonacci numbers and printing "Hello") that shows me how this function would be used in practical applications?

asked Apr 03 '18 18:04

J. Taylor

1 Answer

Queue.task_done is not there for the workers' benefit. It is there to support Queue.join.

If I give you a box of work assignments, do I care about when you've taken everything out of the box?

No. I care about when the work is done. Looking at an empty box doesn't tell me that. You and 5 other guys might still be working on stuff you took out of the box.
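To make the empty-box point concrete, here is a minimal sketch. The item name "job-1" and the two Event objects are made up for illustration; the events exist only to freeze the worker at a known point so the check is deterministic.

```python
import queue
import threading

q = queue.Queue()
q.put("job-1")  # hypothetical work item

picked_up = threading.Event()   # set once the worker has taken the item
may_finish = threading.Event()  # lets the main thread decide when the work ends
results = []

def worker():
    item = q.get()              # the box is now empty...
    picked_up.set()
    may_finish.wait()           # ...but the work is still in progress
    results.append(item.upper())
    q.task_done()               # only now is the task reported as done

t = threading.Thread(target=worker)
t.start()

picked_up.wait()
empty_while_working = q.empty()         # nothing left in the queue
results_while_working = list(results)   # yet nothing has been produced

may_finish.set()
q.join()                                # returns only after task_done()
results_after_join = list(results)
t.join()
```

While the worker is paused, the queue already reports itself as empty even though no result exists yet. Only q.join() tells the main thread that the work is actually finished.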

Queue.task_done lets workers say when a task is done. Someone waiting for all the work to be done with Queue.join will block until task_done has been called once for every item that was put on the queue, not merely until the queue is empty.
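Since the question asked for something more practical than printing "Hello", here is a rough sketch using file I/O. It is only one way to wire this up: the helper count_words, the file names, and the choice of three workers are all assumptions for the example. It uses get_nowait() as in the question, which works here only because every item is queued before the workers start.

```python
import queue
import tempfile
import threading
from pathlib import Path

def count_words(path):
    # Hypothetical "processing" step: read a file and count its words.
    return len(Path(path).read_text().split())

def worker(q, results, lock):
    while True:
        try:
            path = q.get_nowait()
        except queue.Empty:
            return                      # nothing left to pick up
        try:
            n = count_words(path)
            with lock:
                results[path.name] = n
        finally:
            q.task_done()               # one call per item taken, even on error

with tempfile.TemporaryDirectory() as tmp:
    q = queue.Queue()
    for i in range(1, 6):
        p = Path(tmp) / f"doc{i}.txt"
        p.write_text("word " * i)       # doc1 has 1 word, doc5 has 5
        q.put(p)

    results = {}
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(q, results, lock))
               for _ in range(3)]
    for t in threads:
        t.start()

    q.join()                            # blocks until all 5 items are processed
    total = sum(results.values())       # safe to read: every task is done

    for t in threads:
        t.join()
```

Putting task_done() in a finally block is a deliberate choice: if a worker raised without calling it, q.join() would wait forever.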

eigenfield points out in the comments that it seems really weird for a queue to have task_done/join methods. That's true, but it's really a naming problem. The queue module has bad name choices that make it sound like a general-purpose queue library, when it's really a thread communication library.

It'd be weird for a general-purpose queue to have task_done/join methods, but it's entirely reasonable for an inter-thread message channel to have a way to indicate that messages have been processed. If the class was called thread_communication.MessageChannel instead of queue.Queue and task_done was called message_processed, the intent would be a lot clearer.

(If you need a general-purpose queue rather than an inter-thread message channel, use collections.deque.)
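For comparison, a small sketch of collections.deque used as a plain queue; the item values are arbitrary. It has no task_done or join, since it only stores items.

```python
from collections import deque

pending = deque(["a", "b"])
pending.append("c")             # enqueue on the right
first = pending.popleft()       # dequeue from the left (FIFO)
pending.appendleft("urgent")    # a deque can also grow at the front
last = pending.pop()            # ...and shrink from the back
remaining = list(pending)
```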

answered Oct 06 '22 01:10

user2357112 supports Monica
