
What is a Python thread?

I have several questions regarding Python threads.

  1. Is a Python thread a Python or OS implementation?
  2. When I use htop, a multi-threaded script has multiple entries with the same memory consumption and the same command but different PIDs. Does this mean that a [Python] thread is actually a special kind of process? (I know htop has a setting, Hide userland threads, that shows these threads as one process.)
  3. Documentation says:

A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left.

My interpretation/understanding was: main thread terminates when all non-daemon threads are terminated.

So python daemon threads are not part of Python program if "the entire Python program exits when only daemon threads are left"?

asked Dec 24 '11 08:12 by warvariuc



2 Answers

  1. Python threads are implemented using OS threads in all implementations I know (CPython, PyPy and Jython). For each Python thread, there is an underlying OS thread.

  2. Some operating systems (Linux being one of them) show all different threads launched by the same executable in the list of all running processes. This is an implementation detail of the OS, not of Python. On some other operating systems, you may not see those threads when listing all the processes.

  3. The process terminates when the last non-daemon thread finishes. At that point, all the daemon threads are terminated. So those threads are part of your process, but they do not prevent it from terminating (while a regular thread does). That is implemented in pure Python. A process terminates when the system _exit function is called (which kills all threads). When the main thread terminates (or sys.exit is called), the Python interpreter checks whether any other non-daemon thread is running. If there is none, it calls _exit; otherwise it waits for the non-daemon threads to finish.
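
Points 1 and 2 can be checked from inside a script. The following is a minimal sketch, assuming Python 3.8 or later (where threading.get_native_id() is available); the helper names record_ids and collect_thread_ids are made up for this illustration. Every worker reports the same process ID but its own OS-level thread ID.

```python
import os
import threading


def record_ids(results, index, barrier):
    # get_native_id() (Python 3.8+) returns the thread ID assigned by the OS kernel.
    results[index] = (os.getpid(), threading.get_native_id())
    # Wait until every worker has recorded its IDs, so all workers are alive
    # at the same time and the OS cannot hand a finished thread's ID to another.
    barrier.wait()


def collect_thread_ids(num_threads=3):
    """Start num_threads workers and return their (pid, native thread id) pairs."""
    results = [None] * num_threads
    barrier = threading.Barrier(num_threads)
    threads = [
        threading.Thread(target=record_ids, args=(results, i, barrier))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for pid, native_id in collect_thread_ids():
        print("pid=%d native thread id=%d" % (pid, native_id))
```

On Linux, the native thread IDs printed here should correspond to the extra "PIDs" that htop lists for the script's threads.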


The daemon thread flag is implemented in pure Python by the threading module. When the module is loaded, a Thread object is created to represent the main thread, and its _exitfunc method is registered as an atexit hook.

The code of this function is:

    class _MainThread(Thread):

        def _exitfunc(self):
            self._Thread__stop()
            t = _pickSomeNonDaemonThread()
            if t:
                if __debug__:
                    self._note("%s: waiting for other threads", self)
            while t:
                t.join()
                t = _pickSomeNonDaemonThread()
            if __debug__:
                self._note("%s: exiting", self)
            self._Thread__delete()

This function is called by the Python interpreter when sys.exit is called or when the main thread terminates. It returns once only daemon threads (if any) are still running, and the interpreter then calls the system _exit function.

When the _exit function is called, the OS will terminate all of the process threads, and then terminate the process. The Python runtime will not call the _exit function until all the non-daemon threads are done.

All threads are part of the process.
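
The waiting logic described above can be approximated with public threading APIs only. This is a rough Python 3 sketch, not the actual code of the threading module; pick_non_daemon_thread and wait_for_non_daemon_threads are hypothetical names chosen to mirror the quoted _exitfunc.

```python
import threading
import time


def pick_non_daemon_thread(candidates=None):
    """Return one live non-daemon thread other than the caller, or None."""
    if candidates is None:
        # Default: look at every thread the threading module knows about.
        candidates = threading.enumerate()
    current = threading.current_thread()
    for t in candidates:
        if t is not current and not t.daemon and t.is_alive():
            return t
    return None


def wait_for_non_daemon_threads(candidates=None):
    """Join non-daemon threads one by one; daemon threads are ignored."""
    waited = []
    t = pick_non_daemon_thread(candidates)
    while t:
        t.join()
        waited.append(t.name)
        t = pick_non_daemon_thread(candidates)
    return waited


if __name__ == "__main__":
    stop = threading.Event()
    worker = threading.Thread(target=time.sleep, args=(0.5,), name="worker")
    background = threading.Thread(target=stop.wait, name="background", daemon=True)
    worker.start()
    background.start()
    print(wait_for_non_daemon_threads([worker, background]))
    stop.set()
```

The daemon thread is still running when the function returns, which is the state in which the interpreter would go on to exit the process.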


My interpretation/understanding was: main thread terminates when all non-daemon threads are terminated.

So python daemon threads are not part of python program if "the entire Python program exits when only daemon threads are left"?

Your understanding is incorrect. For the OS, a process is composed of many threads, all of which are equal (there is nothing special about the main thread for the OS, except that the C runtime adds a call to _exit at the end of the main function). And the OS doesn't know about daemon threads. This is purely a Python concept.

The Python interpreter uses native threads to implement Python threads, but has to keep track of the threads it created. Using its atexit hook, it ensures that the _exit function is called only after the last non-daemon thread terminates. By "the entire Python program", the documentation means the whole process.


The following program can help you understand the difference between a daemon thread and a regular thread:

    import sys
    import time
    import threading


    class WorkerThread(threading.Thread):

        def run(self):
            # Loop forever; only the process exiting can stop this thread.
            while True:
                print('Working hard')
                time.sleep(0.5)


    def main(args):
        use_daemon = False
        for arg in args:
            if arg == '--use_daemon':
                use_daemon = True
        worker = WorkerThread()
        # Must be set before start(); equivalent to the older setDaemon().
        worker.daemon = use_daemon
        worker.start()
        time.sleep(1)
        sys.exit(0)


    if __name__ == '__main__':
        main(sys.argv[1:])

If you execute this program with the --use_daemon flag, it prints only a small number of Working hard lines and then exits. Without the flag, the program does not terminate even when the main thread finishes, and it keeps printing Working hard lines until it is killed.
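
The same difference can be observed without killing anything by hand. The sketch below assumes Python 3.7 or later; the CHILD_SCRIPT text and the run_child helper are made up for this illustration. It runs a small child interpreter whose worker finishes on its own after half a second, once as a daemon and once as a regular thread, and captures what the child printed.

```python
import subprocess
import sys

# Child program: the main thread finishes immediately, the worker needs 0.5 s.
CHILD_SCRIPT = """
import sys, threading, time

def work():
    time.sleep(0.5)
    print("worker finished")

t = threading.Thread(target=work, daemon=(sys.argv[1] == "daemon"))
t.start()
print("main finished")
"""


def run_child(mode):
    """Run the child with mode 'daemon' or 'regular' and return its output lines."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT, mode],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    print("daemon :", run_child("daemon"))
    print("regular:", run_child("regular"))
```

With a daemon worker the child exits as soon as the main thread is done, so the worker's line is normally missing; with a regular worker the interpreter waits for it before exiting.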

answered Sep 20 '22 21:09 by Sylvain Defresne


I'm not familiar with the implementation, so let's run an experiment:

    import threading
    import time

    def target():
        while True:
            print('Thread working...')
            time.sleep(5)

    NUM_THREADS = 5

    for i in range(NUM_THREADS):
        thread = threading.Thread(target=target)
        thread.start()

  1. The number of threads reported by ps -o cmd,nlwp <pid> is NUM_THREADS+1 (one more for the main thread). Since OS tools can see and count them, they should be OS threads. I tried both CPython and Jython; although Jython runs some additional threads of its own, ps increments the thread count by one for each extra thread that I add.

  2. I'm not sure about htop behaviour, but ps seems to be consistent.

  3. I added the following line before starting the threads:

    thread.daemon = True 

    When I executed the script using CPython, the program terminated almost immediately and no process was found using ps, so my guess is that the program terminated together with the threads. In Jython the program behaved the same way as before (it didn't terminate), so maybe some other JVM threads prevent the program from terminating, or daemon threads aren't supported.
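
The ps check from point 1 can also be approximated from inside the script. This is a Linux-only sketch that assumes /proc is mounted; os_thread_count and count_with_extra_threads are hypothetical helper names. It counts the entries in /proc/self/task, which holds one directory per kernel thread of the process, before and while some extra Python threads are alive.

```python
import os
import threading


def os_thread_count():
    """Return the number of kernel threads in this process, or None if unknown."""
    task_dir = "/proc/self/task"  # Linux-specific: one entry per kernel thread
    if not os.path.isdir(task_dir):
        return None
    return len(os.listdir(task_dir))


def count_with_extra_threads(num_threads=5):
    """Return the OS thread count before and while num_threads workers are alive."""
    stop = threading.Event()
    threads = [threading.Thread(target=stop.wait) for _ in range(num_threads)]
    before = os_thread_count()
    for t in threads:
        t.start()
    during = os_thread_count()
    # Release the workers so the function does not leave threads behind.
    stop.set()
    for t in threads:
        t.join()
    return before, during


if __name__ == "__main__":
    print(count_with_extra_threads())
```

If nothing else in the process starts or stops threads in the meantime, the second number should exceed the first by the number of workers, matching what ps reports as nlwp.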

Note: I used Ubuntu 11.10 with Python 2.7.2+ and Jython 2.2.1 on Java 1.6.0_23.

answered Sep 20 '22 21:09 by jcollado