Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Threading in Python [closed]

What are the modules used to write multi-threaded applications in Python? I'm aware of the basic concurrency mechanisms provided by the language and also of Stackless Python, but what are their respective strengths and weaknesses?

like image 304
Jon Avatar asked Jul 27 '09 19:07

Jon


People also ask

Do Python threads close themselves?

In Python, any alive non-daemon thread blocks the main program to exit. Whereas, daemon threads themselves are killed as soon as the main program exits. In other words, as soon as the main program exits, all the daemon threads are killed.

How do you close a thread within?

exit() is executed from within a thread it will close that thread only.

What are the limitations of threading in Python?

In fact, a Python process cannot run threads in parallel but it can run them concurrently through context switching during I/O bound operations. This limitation is actually enforced by GIL. The Python Global Interpreter Lock (GIL) prevents threads within the same process to be executed at the same time.

How do you start and stop a thread in Python?

Event can be checked via the is_set() function. The main thread, or another thread, can then set the event in order to stop the new thread from running. The event can be set or made True via the set() function. Now that we know how to stop a Python thread, let's look at some worked examples.


1 Answers

In order of increasing complexity:

Use the threading module

Pros:

  • It's really easy to run any function (any callable in fact) in its own thread.
  • Sharing data is if not easy (locks are never easy :), at least simple.

Cons:

  • As mentioned by Juergen Python threads cannot actually concurrently access state in the interpreter (there's one big lock, the infamous Global Interpreter Lock.) What that means in practice is that threads are useful for I/O bound tasks (networking, writing to disk, and so on), but not at all useful for doing concurrent computation.

Use the multiprocessing module

In the simple use case this looks exactly like using threading except each task is run in its own process not its own thread. (Almost literally: If you take Eli's example, and replace threading with multiprocessing, Thread, with Process, and Queue (the module) with multiprocessing.Queue, it should run just fine.)

Pros:

  • Actual concurrency for all tasks (no Global Interpreter Lock).
  • Scales to multiple processors, can even scale to multiple machines.

Cons:

  • Processes are slower than threads.
  • Data sharing between processes is trickier than with threads.
  • Memory is not implicitly shared. You either have to explicitly share it or you have to pickle variables and send them back and forth. This is safer, but harder. (If it matters increasingly the Python developers seem to be pushing people in this direction.)

Use an event model, such as Twisted

Pros:

  • You get extremely fine control over priority, over what executes when.

Cons:

  • Even with a good library, asynchronous programming is usually harder than threaded programming, hard both in terms of understanding what's supposed to happen and in terms of debugging what actually is happening.

In all cases I'm assuming you already understand many of the issues involved with multitasking, specifically the tricky issue of how to share data between tasks. If for some reason you don't know when and how to use locks and conditions you have to start with those. Multitasking code is full of subtleties and gotchas, and it's really best to have a good understanding of concepts before you start.

like image 172
quark Avatar answered Sep 19 '22 17:09

quark