Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Threads is not executing in parallel python with ThreadPoolExecutor

I'm new in python threading and I'm experimenting this: When I run something in threads (whenever I print outputs), it never seems to be running in parallel. Also, my functions take the same time that before using the library concurrent.futures (ThreadPoolExecutor). I have to calculate the gains of some attributes over a dataset (I cannot use libraries). Since I have about 1024 attributes and the function was taking about a minute to execute (and I have to use it in a for iteration) I dicided to split the array of attributes into 10 (just as an example) and run the separete function gain(attribute) separetly for each sub array. So I did the following (avoiding some extra unnecessary code):

def calculate_gains(self):
    splited_attributes = np.array_split(self.attributes, 10)
    result = {}
    for atts in splited_attributes:
        with concurrent.futures.ThreadPoolExecutor() as executor:
            future = executor.submit(self.calculate_gains_helper, atts)
            return_value = future.result()
            self.gains = {**self.gains, **return_value}

Here's the calculate_gains_helper:

def calculate_gains_helper(self, attributes):
    inter_result = {}
    for attribute in attributes:
        inter_result[attribute] = self.gain(attribute)
    return inter_result

Am I doing something wrong? I read some other older posts but I couldn't get any info. Thanks a lot for any help!

like image 495
Leandro D. Avatar asked Apr 10 '20 22:04

Leandro D.


People also ask

Do threads run in parallel in Python?

In fact, a Python process cannot run threads in parallel but it can run them concurrently through context switching during I/O bound operations. This limitation is actually enforced by GIL. The Python Global Interpreter Lock (GIL) prevents threads within the same process to be executed at the same time.

Is ThreadPoolExecutor thread-safe Python?

ThreadPoolExecutor Thread-Safety Although the ThreadPoolExecutor uses threads internally, you do not need to work with threads directly in order to execute tasks and get results. Nevertheless, when accessing resources or critical sections, thread-safety may be a concern.

How does ThreadPoolExecutor work in Python?

ThreadPoolExecutor. ThreadPoolExecutor is an Executor subclass that uses a pool of threads to execute calls asynchronously. An Executor subclass that uses a pool of at most max_workers threads to execute calls asynchronously.

How does concurrent futures ThreadPoolExecutor work?

ThreadPoolExecutor Methods : submit(fn, *args, **kwargs): It runs a callable or a method and returns a Future object representing the execution state of the method. map(fn, *iterables, timeout = None, chunksize = 1) : It maps the method and iterables together immediately and will raise an exception concurrent. futures.


1 Answers

Python threads do not run in parallel (at least in CPython implementation) because of the GIL. Use processes and ProcessPoolExecutor to really have parallelism

with concurrent.futures.ProcessPoolExecutor() as executor:
    ...
like image 97
michaeldel Avatar answered Sep 23 '22 20:09

michaeldel