I have a Python application that, to be brief, receives data from a remote server, processes it, responds to the server, and occasionally saves the processed data to disk. The problem I've encountered is that there is a lot of data to write, and the save can take upwards of half a minute. This is apparently a blocking operation, so network I/O stalls during that time. I'd like the save to take place in the background, so to speak, so that the application can keep communicating with the server reasonably quickly.
I know that I probably need some kind of threading module to accomplish this, but I can't tell what the differences are between thread, threading, multiprocessing, and the various other options. Does anybody know what I'm looking for?
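Since you're I/O bound, use the threading module. CPython releases the GIL while a thread is blocked on I/O, so a thread writing to disk does not stop another thread from talking to the network.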
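As a rough illustration of the idea, here is a minimal sketch that hands the slow save to a worker thread. The names save_to_disk and save_in_background, the JSON format, and the file path are all made up for this example; the real save routine would take their place.

```python
import json
import os
import tempfile
import threading


def save_to_disk(data, path):
    """Blocking write; a hypothetical stand-in for the real save routine."""
    with open(path, "w") as f:
        json.dump(data, f)


def save_in_background(data, path):
    """Start the save on a worker thread and return without waiting for it."""
    # Shallow copy, so the caller can keep adding or replacing keys while
    # the write is in progress (nested objects are still shared).
    snapshot = dict(data)
    worker = threading.Thread(target=save_to_disk, args=(snapshot, path))
    worker.start()
    return worker


if __name__ == "__main__":
    path = os.path.join(tempfile.gettempdir(), "processed_example.json")
    worker = save_in_background({"packets": 3}, path)
    # ... the application would keep talking to the server here ...
    worker.join()  # wait for the write to finish before exiting
```

Taking a snapshot before starting the thread is one simple way to avoid the writer seeing half-updated data; if the data is large or nested, a lock or a queue of finished records may be a better fit.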
You should almost never need to use thread directly. It is a low-level interface (renamed _thread in Python 3), and the threading module is a high-level wrapper around it.
The multiprocessing module is different from threading: it uses multiple subprocesses to execute a task. It deliberately offers an interface similar to that of threading to reduce the learning curve. multiprocessing is typically used when you have CPU-bound calculations and need to avoid the GIL (Global Interpreter Lock) on a multicore CPU.
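For contrast, a minimal sketch of the CPU-bound case, where separate processes help and threads would not. The count_primes function and the input sizes are invented for illustration; this is not needed for the disk-save problem in the question.

```python
import multiprocessing


def count_primes(limit):
    """Count primes below limit by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    # Each input is handled in its own worker process, so the work can
    # run on several cores at once instead of contending for one GIL.
    with multiprocessing.Pool(processes=2) as pool:
        results = pool.map(count_primes, [10_000, 20_000])
    print(results)
```

The `if __name__ == "__main__":` guard matters here: on platforms that start workers by spawning a fresh interpreter, the module is re-imported in each child, and unguarded pool creation would recurse.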
A somewhat more esoteric alternative to multithreading is asynchronous I/O using the asyncore module (deprecated since Python 3.6 and removed in 3.12; asyncio is the standard-library replacement). Other options include Stackless Python and Twisted.
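If you go the asynchronous route on a current Python, a sketch along these lines may help. It uses asyncio rather than asyncore, which is a substitution on my part, and it assumes Python 3.9+ for asyncio.to_thread. The names save_to_disk, handle_traffic, and main are hypothetical, and the sleep calls merely stand in for real network I/O.

```python
import asyncio
import json
import os
import tempfile


def save_to_disk(data, path):
    """Blocking write; a hypothetical stand-in for the real save routine."""
    with open(path, "w") as f:
        json.dump(data, f)


async def handle_traffic(ticks):
    """Stand-in for the network loop; each sleep represents awaiting I/O."""
    for _ in range(3):
        await asyncio.sleep(0.01)
        ticks.append("tick")


async def main(path):
    ticks = []
    # to_thread runs the blocking save on a worker thread, so the event
    # loop stays free to service handle_traffic in the meantime.
    await asyncio.gather(
        asyncio.to_thread(save_to_disk, {"packets": 3}, path),
        handle_traffic(ticks),
    )
    return ticks


if __name__ == "__main__":
    path = os.path.join(tempfile.gettempdir(), "processed_async_example.json")
    asyncio.run(main(path))
```

Note that the blocking write still ends up on a thread; the event loop only avoids threads for the network side.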