
Understand python threading bug

Reading http://bugs.python.org/msg160297, I can see a simple script written by Stephen White that demonstrates how Python's threading module fails with this exception:

Exception AttributeError: AttributeError("'_DummyThread' object has no attribute '_Thread__block'",) in <module 'threading' 

Given Stephen White's source code (http://bugs.python.org/file25511/bad-thread.py),

import os
import thread
import threading
import time

def t():
    threading.currentThread() # Populate threading._active with a DummyThread
    time.sleep(3)

thread.start_new_thread(t, ())

time.sleep(1)

pid = os.fork()
if pid == 0:
    os._exit(0)

os.waitpid(pid, 0)

how would we rewrite it so that this error no longer occurs?

Calvin Cheng asked Nov 02 '12 10:11



1 Answer

The bug occurs because of a bad interaction between dummy thread objects created by the threading API when one calls threading.currentThread() on a foreign thread, and the threading._after_fork function, called to clean up resources after a call to os.fork().
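The first half of that interaction, a dummy thread object appearing for a foreign thread, can be observed directly. The sketch below assumes Python 3, where the low-level module is named `_thread` and the accessor is spelled `threading.current_thread()` (the question's script targets Python 2). The names `probe`, `result`, and `done` are illustrative only.

```python
import _thread
import threading

result = {}
done = threading.Event()

def probe():
    # This thread is started through the low-level API, so the threading
    # module has no Thread object for it and creates a dummy one on demand.
    result["class_name"] = type(threading.current_thread()).__name__
    done.set()

_thread.start_new_thread(probe, ())
done.wait(timeout=5)  # wait until the foreign thread has recorded its answer
print(result["class_name"])
```

On CPython 3 the recorded class name should be `_DummyThread`, the same class the error message in the question refers to.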

To work around the bug without modifying Python's source, monkey-patch threading._DummyThread with a no-op implementation of __stop:

import threading
threading._DummyThread._Thread__stop = lambda x: 42
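If the same code may also run on interpreters that no longer have this private method, the patch can be guarded. This is a minimal sketch, assuming that the presence of the mangled name `_Thread__stop` on `threading.Thread` is a good enough signal that the workaround applies; the helper name `patch_dummy_thread_stop` is hypothetical.

```python
import threading

def patch_dummy_thread_stop():
    """Install the no-op __stop only where Thread still has that private method.

    Returns True if the patch was applied, False otherwise.
    """
    # _Thread__stop is the name-mangled form of Thread.__stop (Python 2.x).
    if hasattr(threading.Thread, "_Thread__stop"):
        threading._DummyThread._Thread__stop = lambda self: None
        return True
    return False

applied = patch_dummy_thread_stop()
print("patch applied:", applied)
```

Call it once, early in the program, before any `os.fork()` takes place.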

The cause of the bug is best narrowed down in comments by Richard Oudkerk and cooyeah. What happens is the following:

  1. The threading module allows threading.currentThread() to be called from a thread not created by the threading API calls. It then returns a "dummy thread" instance which supports a very limited subset of the Thread API, but is still useful for identifying the current thread.

  2. threading._DummyThread is implemented as a subclass of Thread. Thread instances normally contain an internal condition object (self.__block) that keeps a reference to an OS-level lock allocated for the instance. Since public Thread methods that might end up using self.__block are all overridden by _DummyThread, _DummyThread's constructor intentionally releases the OS-level lock by deleting self.__block.

  3. threading._after_fork breaks the encapsulation and calls the private Thread.__stop method on all registered threads, including the dummy ones, where __stop was never meant to be invoked. (They weren't started by Python, so their stopping is not managed by Python either.) As the dummy threads don't know about __stop, they inherit it from Thread, and that implementation happily accesses the private __block attribute that doesn't exist in _DummyThread instances. This access finally causes the error.
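The three steps above can be modelled without touching the threading module at all. The following is a toy sketch, not the real implementation: `Base` stands in for Thread, `Dummy` for _DummyThread, and `stop_all` for threading._after_fork; all three names are invented for illustration.

```python
class Base(object):
    def __init__(self):
        # Name-mangled to _Base__block, like Thread's private __block.
        self.__block = object()

    def __stop(self):
        # Name-mangled to _Base__stop; relies on the private attribute.
        return self.__block

class Dummy(Base):
    def __init__(self):
        Base.__init__(self)
        # Like _DummyThread's constructor, drop the private attribute.
        del self._Base__block

def stop_all(instances):
    # Like _after_fork, reach into the private method from outside the class.
    errors = []
    for inst in instances:
        try:
            inst._Base__stop()
        except AttributeError as exc:
            errors.append(str(exc))
    return errors

errors = stop_all([Base(), Dummy()])
print(errors)
```

Only the `Dummy` instance fails, and the message names the mangled attribute `_Base__block`, which parallels the `_Thread__block` in the original traceback.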

The bug is fixed in the 2.7 branch by modifying Thread.__stop not to break when __block is deleted. The 3.x branch, where __stop is spelled as _stop and therefore protected, fixes it by overriding _DummyThread's _stop to do nothing.
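Both styles of fix can be illustrated on a small stand-in class hierarchy. This is a sketch under stated assumptions, not the actual patches: `Base` and `Dummy` are invented stand-ins for Thread and _DummyThread, and the `getattr` default is just one hypothetical way of "not breaking" when the attribute is missing.

```python
class Base(object):
    def __init__(self):
        self.__block = object()  # mangled to _Base__block

    def __stop(self):
        # 2.7-style fix: tolerate the private attribute having been deleted.
        return getattr(self, "_Base__block", None)

    def _stop(self):
        # 3.x-style spelling: one underscore, so subclasses can override it.
        return self.__block

class Dummy(Base):
    def __init__(self):
        Base.__init__(self)
        del self._Base__block  # the dummy gives up its private attribute

    def _stop(self):
        # 3.x-style fix: stopping a dummy is a no-op.
        pass

d = Dummy()
tolerant = d._Base__stop()  # inherited, but no longer raises
overridden = d._stop()      # overridden, never touches the attribute
print(tolerant, overridden)
```

The first approach keeps the base class robust against any subclass that deletes the attribute, while the second puts the special case in the subclass that needs it.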

user4815162342 answered Sep 19 '22 07:09