I have a pretty basic understanding of multithreading in Python and an even basic-er understanding of asyncio
.
I'm currently writing a small Curses-based program (eventually going to be using a full GUI, but that's another story) that handles the UI and user IO in the main thread, and then has two other daemon threads (each with their own queue/worker-method-that-gets-things-from-a-queue):
watcher
thread that watches for time-based and conditional (e.g. posts to a message board, received messages, etc.) events to occur and then puts required tasks into...worker
) daemon thread's queue which then completes them.All three threads are continuously running concurrently, which leads me to some questions:
worker
thread's queue (or, more generally, any thread's queue) is empty, should it be stopped until is has something to do again, or is it okay to leave continuously running? Do concurrent threads take up a lot of processing power when they aren't doing anything other than watching its queue?watcher
thread is continuously running a single method, I guess the worker
thread would be able to just pull tasks from the single queue that the watcher
thread puts in.watcher
thread be running continuously like that? From what I understand, and please correct me if I'm wrong, asyncio
is supposed to be used for event-based multithreading, which seems relevant to what I'm trying to do.asyncio
would be perfect for, but, again, I'm not sure.Thanks!
One of the cool advantages of asyncio is that it scales far better than threading . Each task takes far fewer resources and less time to create than a thread, so creating and running more of them works well. This example just creates a separate task for each site to download, which works out quite well.
asyncio essentially provides significantly more control over concurrency at the cost of you need to take control of the concurrency more.
For single processes you're right, but this article (and a lot of the activity around asyncio in Python) is about backend webdev, where you're already running multiple app servers. In this context, asyncio is almost always slower.
You can run multiple event loops on different threads. A single thread helps us to achieve better performance as compared to what we have achieved with multi-threading, also it is easy or clean to write async code in Python than multi-threaded code.
Multithreading with threading module is preemptive, which entails voluntary and involuntary swapping of threads. AsyncIO is a single thread single process cooperative multitasking.
The major advantage of asyncio approach vs green-threads approach is that with asyncio we have cooperative multitasking and places in code where some task can yield control are clearly indicated by await, async with and async for. It makes it easier to reason about common concurrency problem of data races.
It’s because asyncio is more robust with task scheduling and provides the user with full control of code execution. You can pause the code by using the await keyword and during the wait, you could run nothing or go ahead executing other code. As a result, resources are not locked down during the wait.
But think about it. Both multi-threading and asyncio are meant for “parallel” jobs, and they both have fired up extra resources to complete the jobs faster. Using either of them to read one URL won’t save your time if not waste your time.
When the worker thread's queue (or, more generally, any thread's queue) is empty, should it be stopped until is has something to do again, or is it okay to leave continuously running? Do concurrent threads take up a lot of processing power when they aren't doing anything other than watching its queue?
You should just use a blocking call to queue.get()
. That will leave the thread blocked on I/O, which means the GIL will be released, and no processing power (or at least a very minimal amount) will be used. Don't use non-blocking gets in a while
loop, since that's going to require a lot more CPU wakeups.
Should the two threads' queues be combined? Since the watcher thread is continuously running a single method, I guess the worker thread would be able to just pull tasks from the single queue that the watcher thread puts in.
If all the watcher is doing is pulling things off a queue and immediately putting it into another queue, where it gets consumed by a single worker, it sounds like its unnecessary overhead - you may as well just consume it directly in the worker. It's not exactly clear to me if that's the case, though - is the watcher consuming from a queue, or just putting items into one? If it is consuming from a queue, who is putting stuff into it?
I don't think it'll matter since I'm not multiprocessing, but is this setup affected by Python's GIL (which I believe still exists in 3.4) in any way?
Yes, this is affected by the GIL. Only one of your threads can run Python bytecode at a time, so won't get true parallelism, except when threads are running I/O (which releases the GIL). If your worker thread is doing CPU-bound activities, you should seriously consider running it in a separate process via multiprocessing
, if possible.
Should the watcher thread be running continuously like that? From what I understand, and please correct me if I'm wrong, asyncio is supposed to be used for event-based multithreading, which seems relevant to what I'm trying to do.
It's hard to say, because I don't know exactly what "running continuously" means. What is it doing continuously? If it spends most of its time sleeping or blocking on a queue
, it's fine - both of those things release the GIL. If it's constantly doing actual work, that will require the GIL, and therefore degrade the performance of the other threads in your app (assuming they're trying to do work at the same time). asyncio
is designed for programs that are I/O-bound, and can therefore be run in a single thread, using asynchronous I/O. It sounds like your program may be a good fit for that depending on what your worker
is doing.
The main thread is basically always just waiting for the user to press a key to access a different part of the menu. This seems like a situation asyncio would be perfect for, but, again, I'm not sure.
Any program where you're mostly waiting for I/O is potentially a good for for asyncio
- but only if you can find a library that makes curses (or whatever other GUI library you eventually choose) play nicely with it. Most GUI frameworks come with their own event loop, which will conflict with asyncio
's. You would need to use a library that can make the GUI's event loop play nicely with asyncio
's event loop. You'd also need to make sure that you can find asyncio
-compatible versions of any other synchronous-I/O based library your application uses (e.g. a database driver).
That said, you're not likely to see any kind of performance improvement by switching from your thread-based program to something asyncio
-based. It'll likely perform about the same. Since you're only dealing with 3 threads, the overhead of context switching between them isn't very significant, so switching from that a single-threaded, asynchronous I/O approach isn't going to make a very big difference. asyncio
will help you avoid thread synchronization complexity (if that's an issue with your app - it's not clear that it is), and at least theoretically, would scale better if your app potentially needed lots of threads, but it doesn't seem like that's the case. I think for you, it's basically down to which style you prefer to code in (assuming you can find all the asyncio
-compatible libraries you need).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With