Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is a Python I/O bound task not blocked by the GIL?

The python threading documentation states that "...threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously", apparently because I/O-bound processes can avoid the GIL that prevents threads from concurrent execution in CPU-bound tasks.

But what I dont understand is that an I/O task still uses the CPU. So how could it not encounter the same issues? Is it because the I/O bound task will not require memory management?

like image 927
Terrence Brannon Avatar asked Mar 26 '15 03:03

Terrence Brannon


People also ask

How does threading work in Python with GIL?

The GIL allows only one OS thread to execute Python bytecode at any given time, and the consequence of this is that it's not possible to speed up CPU-intensive Python code by distributing the work among multiple threads. This is, however, not the only negative effect of the GIL.

Will Python GIL be removed?

The GIL has long been seen as an obstacle to better multithreaded performance in CPython (and thus Python generally). Many efforts have been made to remove it over the years, but at the cost of hurting single-threaded performance—in other words, by making the vast majority of existing Python applications slower.

What is GIL in Python and how does it work?

In CPython, the global interpreter lock, or GIL, is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once. The GIL prevents race conditions and ensures thread safety.


2 Answers

All of Python's blocking I/O primitives release the GIL while waiting for the I/O block to resolve -- it's as simple as that! They will of course need to acquire the GIL again before going on to execute further Python code, but for the long-in-terms-of-machine-cycles intervals in which they're just waiting for some I/O syscall, they don't need the GIL, so they don't hold on to it!

like image 155
Alex Martelli Avatar answered Sep 19 '22 06:09

Alex Martelli


The GIL in CPython1 is only concerned with Python code being executed. A thread-safe C extension that uses a lot of CPU might release the GIL as long as it doesn't need to interact with the Python runtime.

As soon as the C code needs to 'talk' to Python (read: call back into the Python runtime) then it needs to acquire the GIL again - that is, the GIL is to establish protection/atomic behavior for the "interpreter" (and I use the term loosely) and is not to prevent native/non-Python code from running concurrently.

Releasing the GIL around I/O (blocking or not, using CPU or not) is the same thing - until the data is moved into Python there is no reason to acquire the GIL.


1 The GIL is controversial because it prevents multithreaded CPython programs from taking full advantage of multiprocessor systems in certain situations. Note that potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck.

like image 35
user2864740 Avatar answered Sep 17 '22 06:09

user2864740