
Is it possible to set maxtasksperchild for a threadpool?

After running into what are probably memory leaks in a long-running multithreaded script, I found out about maxtasksperchild, which can be used with a multiprocessing pool like this:

import multiprocessing

with multiprocessing.Pool(processes=32, maxtasksperchild=x) as pool:
    # imap is lazy, so consume the results before the with block terminates the pool
    results = list(pool.imap(function, stuff))

Is something similar possible for a thread pool (multiprocessing.pool.ThreadPool)?

Fabian Bosler asked May 16 '19 12:05


1 Answer

As the answer by noxdafox said, the parent class offers no way to do this. You can, however, use the threading module to control the maximum number of tasks each thread handles. Since you want to use multiprocessing.pool.ThreadPool and the threading module works in a similar way, something like this should do:

import threading

def split_processing(yourlist, num_splits=4):
    '''
    yourlist = the list whose items you want to process in threads.
    num_splits = number of threads; each one handles one contiguous slice of yourlist.
    '''
    split_size = len(yourlist) // num_splits
    threads = []
    for i in range(num_splits):
        start = i * split_size
        # the last thread also takes any leftover items
        end = len(yourlist) if i+1 == num_splits else (i+1) * split_size
        # function is your own worker; it must accept (yourlist, start, end)
        threads.append(threading.Thread(target=function, args=(yourlist, start, end)))
        threads[-1].start()

    # wait for all threads to finish
    for t in threads:
        t.join()
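
To see which slice each thread ends up with, the start/end arithmetic from the loop above can be pulled out on its own. This is only a sketch: split_bounds is a hypothetical helper name, not part of the threading module, and it assumes the same integer-division scheme as split_processing.

```python
def split_bounds(n_items, num_splits=4):
    """Return the (start, end) index pairs that each thread would receive."""
    split_size = n_items // num_splits
    bounds = []
    for i in range(num_splits):
        start = i * split_size
        # The last slice absorbs any remainder so no item is dropped.
        end = n_items if i + 1 == num_splits else (i + 1) * split_size
        bounds.append((start, end))
    return bounds


print(split_bounds(10, 3))  # [(0, 3), (3, 6), (6, 10)]
```

When the list length is not divisible by num_splits, the last thread gets the extra items, so its task count can be higher than the others.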

Let's say yourlist has 100 items. Then:

if num_splits = 10, then threads = 10 and each thread has 10 tasks.
if num_splits = 5, then threads = 5 and each thread has 20 tasks.
if num_splits = 50, then threads = 50 and each thread has 2 tasks.
and so on: the more splits, the fewer tasks per thread.
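
If you would rather stay with multiprocessing.pool.ThreadPool itself, one possible workaround is to feed the input in batches and create a fresh pool for every batch, so that no worker thread handles more than a bounded number of tasks. This is a sketch under assumptions, not a built-in feature: map_with_recycled_threads, max_tasks_per_pool and square are hypothetical names made up for illustration.

```python
from multiprocessing.pool import ThreadPool


def map_with_recycled_threads(func, items, threads=4, max_tasks_per_pool=100):
    """Hypothetical helper: map func over items, replacing the pool after each batch."""
    results = []
    for start in range(0, len(items), max_tasks_per_pool):
        batch = items[start:start + max_tasks_per_pool]
        # A fresh ThreadPool per batch means no worker thread outlives the batch.
        with ThreadPool(processes=threads) as pool:
            # map() blocks until the batch is done, so leaving the with block is safe.
            results.extend(pool.map(func, batch))
    return results


def square(x):
    # Stand-in for the real work.
    return x * x


print(map_with_recycled_threads(square, list(range(10)), threads=2, max_tasks_per_pool=4))
```

Keep in mind that threads share the memory of their process, so replacing them only releases per-thread state such as thread-local data. A leak in objects that live at process level would most likely survive this, and multiprocessing.Pool with maxtasksperchild is still the tool for that case.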
ASHu2 answered Oct 19 '22 07:10