Can threads create sub-threads in Python?

I am familiar with the syntax for creating threads in Python:

from threading import Thread
from queue import Queue

task_queue = Queue(maxsize=0)    

num_threads=10
for i in range(num_threads):
    thread = Thread(target=work, args=(task_queue,))
    thread.start()

task_queue.join()

My question is whether it is OK to start new threads 'inside' other threads, like so:

def work(task_queue):
    task = task_queue.get()

    subtasks = task.get_sub_tasks()

    for subtask in subtasks:
        thread = Thread(target=sub_work, args=(subtask,))
        thread.start()

So

  1. Is this structure OK, or is it too messy to do it this way?

  2. If this is OK, are the sub-threads subordinate to the thread that created them, or do they become children of the parent Python process? If the thread that created a sub-thread "dies" with an error, what happens to the sub-thread?

I realize Python threads are subject to the global interpreter lock (GIL), but my application involves access to a server, so the multithreading is there to avoid serialized connections, which would take too long.

Guilherme de Lazari asked Jul 04 '17 14:07


1 Answer

Regarding your questions:

  • Q1: It is not a problem to start "sub-threads" from a thread.
  • Q2: This is actually an interesting question. My instinct says "no", the sub-threads are not subordinate to the thread that created them, but a proof is better than an instinct.
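For Q1, here is a minimal sketch of the pattern from the question, assuming each task is simply a list of subtasks. The body of `sub_work`, the `results` list, and the `None` sentinel are hypothetical stand-ins, not part of the original code. The sketch also joins the sub-threads and calls `task_done()`, which `task_queue.join()` needs in order to return:

```python
from queue import Queue
from threading import Lock, Thread

results = []            # hypothetical sink for subtask results
results_lock = Lock()   # protects `results` across threads


def sub_work(subtask):
    # Stand-in for the real per-subtask work (e.g. one server request).
    with results_lock:
        results.append(subtask * 2)


def work(task_queue):
    while True:
        task = task_queue.get()
        if task is None:            # sentinel: no more tasks for this worker
            task_queue.task_done()
            break
        # Start one sub-thread per subtask, then wait for all of them.
        sub_threads = [Thread(target=sub_work, args=(s,)) for s in task]
        for t in sub_threads:
            t.start()
        for t in sub_threads:
            t.join()
        task_queue.task_done()      # lets task_queue.join() return


task_queue = Queue()
num_threads = 3
workers = [Thread(target=work, args=(task_queue,)) for _ in range(num_threads)]
for w in workers:
    w.start()

for task in ([1, 2], [3], [4, 5, 6]):   # each task is a list of subtasks
    task_queue.put(task)
for _ in workers:                        # one sentinel per worker
    task_queue.put(None)

task_queue.join()
for w in workers:
    w.join()
print(sorted(results))
```

Joining the sub-threads inside `work` is a design choice: it keeps the number of live threads bounded by the tasks currently in progress, which matters when every subtask opens a server connection.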

So I created a quick test as below (I would use a gist but I can't access such things from where I am):

from threading import Thread
import time

def sub_worker(id):
    print("SubWorker started from thread", id)
    while True:
        print("Subworking...")
        time.sleep(5)
def worker(id):
    print("Worker started from thread", id)
    count = 1
    while count < 5:
        print("Working...")
        tmp_thread = Thread(target=sub_worker, args=[count])
        tmp_thread.start()
        count +=1
        time.sleep(1)
    raise EnvironmentError("Tired of working")

main = Thread(target=worker, args=[0])

main.start()

This gives the following output (as expected, an error in the parent thread does not stop the "children"):

Worker started from thread 0
Working...
SubWorker started from thread 1
Subworking...
Working...
SubWorker started from thread 2
Subworking...
Working...
SubWorker started from thread 3
Subworking...
Working...
SubWorker started from thread 4
Subworking...
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Temp\tt\Tools\Anaconda3.4.3.1\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "C:\Temp\tt\Tools\Anaconda3.4.3.1\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "C:/Temp/tt/Tools/PyCharmWorkspace/xml_parse/test.py", line 18, in worker
    raise EnvironmentError("Tired of working")
OSError: Tired of working

Subworking...
Subworking...
Subworking...
Subworking...
Subworking...
Subworking...
Subworking...
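The same behaviour can be checked without reading console output. The following is a sketch of a terminating variant of the test above, assuming Python 3.8 or later for `threading.excepthook`. The names `errors`, `children`, and `stop` are introduced here for illustration only:

```python
import threading

errors = []          # exception types raised inside threads, recorded by the hook
children = []        # sub-threads started by the worker
stop = threading.Event()


def record_exception(args):
    # threading.excepthook (Python 3.8+) receives uncaught thread exceptions.
    errors.append(args.exc_type)


threading.excepthook = record_exception


def sub_worker():
    # Runs until the main thread asks it to stop.
    stop.wait()


def worker():
    for _ in range(3):
        child = threading.Thread(target=sub_worker)
        child.start()
        children.append(child)
    raise OSError("Tired of working")


parent = threading.Thread(target=worker)
parent.start()
parent.join()                       # the parent has died with an error here

parent_alive = parent.is_alive()
children_alive = [c.is_alive() for c in children]
print(parent_alive, children_alive)

stop.set()                          # let the sub-threads finish
for c in children:
    c.join()
```

After the parent has exited with the error, `parent.is_alive()` is false while every sub-thread still reports as alive, until the event is set.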

I think the hierarchy that htop shows may be due to the fact that the Linux kernel treats threads as processes. Since each thread is created with a fork-like call (clone on Linux), htop can display it as part of a tree. With the concept of threads, though, I do not believe a hierarchy makes much sense, as all of them share the same resources (memory, file descriptors, etc.).
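One way to see both halves of that remark from inside Python is the sketch below, assuming Python 3.8 or later for `threading.get_native_id()`. Every thread reports the same process ID, while the kernel assigns each one its own thread ID. The helper names `report` and `parent` are made up for the example:

```python
import os
import threading

pids = []            # process ID seen by each thread
native_ids = []      # kernel-assigned thread ID seen by each thread
lock = threading.Lock()


def report():
    with lock:
        pids.append(os.getpid())
        native_ids.append(threading.get_native_id())


def parent():
    report()
    # A sub-thread started from a non-main thread.
    child = threading.Thread(target=report)
    child.start()
    child.join()


t = threading.Thread(target=parent)
t.start()
t.join()
report()             # the main thread itself

print(len(set(pids)), len(set(native_ids)))
```

The sub-thread is not attached to the thread that started it in any way visible here: it just shows up as one more thread ID in the same process.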

Adonis answered Oct 05 '22 06:10