 

Jupyter notebook never finishes processing using multiprocessing (Python 3)


I am using the multiprocessing module. I am still learning its capabilities, and I am working from the book by Dusty Phillips; this code comes from it.

import multiprocessing
import random
from multiprocessing.pool import Pool

def prime_factor(value):
    factors = []
    for divisor in range(2, value - 1):
        quotient, remainder = divmod(value, divisor)
        if not remainder:
            factors.extend(prime_factor(divisor))
            factors.extend(prime_factor(quotient))
            break
        else:
            factors = [value]
    return factors

if __name__ == '__main__':
    pool = Pool()
    to_factor = [random.randint(100000, 50000000) for i in range(20)]
    results = pool.map(prime_factor, to_factor)
    for value, factors in zip(to_factor, results):
        print("The factors of {} are {}".format(value, factors))

In Windows PowerShell (not in the Jupyter notebook) I see the following:

Process SpawnPoolWorker-5:
Process SpawnPoolWorker-1:
AttributeError: Can't get attribute 'prime_factor' on <module '__main__' (built-in)>

I do not understand why the cell never finishes running.

asked Nov 15 '17 by rsc05


People also ask

How do you exit multiprocessing in Python?

Call kill() on the Process. The method is called on the multiprocessing.Process instance for the process that you wish to terminate.
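A minimal sketch of this, assuming a long-running worker (the sleepy_worker function below is made up for illustration):

import multiprocessing
import time

def sleepy_worker():
    time.sleep(60)  # simulate a long-running job

if __name__ == '__main__':
    p = multiprocessing.Process(target=sleepy_worker)
    p.start()
    time.sleep(1)
    p.kill()   # forcefully terminate the process (available since Python 3.7)
    p.join()
    print(p.exitcode)  # a negative exit code shows the process was killed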

How do I fix Jupyter notebook not running?

Try another browser (e.g. if you normally use Firefox, try Chrome). This helps pin down where the problem is. Try disabling any browser extensions and/or any Jupyter extensions you have installed. Some internet security software can also interfere with Jupyter.

How do locks work in multiprocessing in Python?

Python provides a mutual exclusion lock for use with processes via the multiprocessing.Lock class. An instance of the lock can be created, acquired by a process before it enters a critical section, and released after the critical section. Only one process can hold the lock at any time.
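A minimal sketch, assuming several processes append to the same file (the file name out.txt is arbitrary):

import multiprocessing

def append_line(lock, path, line):
    with lock:  # only one process can hold the lock and write at a time
        with open(path, 'a') as f:
            f.write(line + '\n')

if __name__ == '__main__':
    lock = multiprocessing.Lock()
    procs = [multiprocessing.Process(target=append_line,
                                     args=(lock, 'out.txt', 'line %d' % i))
             for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()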

Does multiprocessing make Python faster?

In multiprocessing, multiple Python processes are created and used to execute a function instead of multiple threads, bypassing the Global Interpreter Lock (GIL) that can significantly slow down threaded Python programs.
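As a rough illustration, here is a sketch comparing a thread pool with a process pool on a CPU-bound function (the busy function and the workload sizes are made up; absolute timings will vary by machine):

import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool

def busy(n):
    # CPU-bound work that holds the GIL when run in a thread
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == '__main__':
    # run as a script; in a notebook the spawned workers would hit
    # the same import issue discussed in this question
    work = [5_000_000] * 8
    for label, PoolCls in [('threads', ThreadPool), ('processes', Pool)]:
        start = time.perf_counter()
        with PoolCls(4) as pool:
            pool.map(busy, work)
        print(label, time.perf_counter() - start)

On a multi-core machine the process pool should finish noticeably faster, since the thread pool's workers serialize on the GIL.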


2 Answers

It seems the problem in Jupyter Notebook, as in some other IDEs, is a design feature: the spawned worker processes re-import __main__, and in a notebook __main__ is not an importable file, so the workers cannot find prime_factor. Therefore, we have to write the function (prime_factor) into a separate file and import it as a module, adjusting the calling code accordingly. For example, in my case I have put the function into a file called defs.py:

def prime_factor(value):
    factors = []
    for divisor in range(2, value - 1):
        quotient, remainder = divmod(value, divisor)
        if not remainder:
            factors.extend(prime_factor(divisor))
            factors.extend(prime_factor(quotient))
            break
        else:
            factors = [value]
    return factors

Then, in the Jupyter notebook, I wrote the following lines:

import multiprocessing
import random
from multiprocessing import Pool
import defs

if __name__ == '__main__':
    pool = Pool()
    to_factor = [random.randint(100000, 50000000) for i in range(20)]
    results = pool.map(defs.prime_factor, to_factor)
    for value, factors in zip(to_factor, results):
        print("The factors of {} are {}".format(value, factors))

This solved my problem.


answered Sep 20 '22 by rsc05


To execute a function without having to write it into a separate file manually:

We can dynamically write the task to a temporary file, import it, and execute the function.

from multiprocessing import Pool
from functools import partial
import inspect

def parallel_task(func, iterable, *params):
    # Write the source of func into a temporary module, renaming it to "task"
    with open('./tmp_func.py', 'w') as file:
        file.write(inspect.getsource(func).replace(func.__name__, "task"))

    from tmp_func import task

    if __name__ == '__main__':
        func = partial(task, params)
        pool = Pool(processes=8)
        res = pool.map(func, iterable)
        pool.close()
        return res
    else:
        raise RuntimeError("Not in Jupyter Notebook")

We can then simply call it in a notebook cell like this:

def long_running_task(params, id):
    # Heavy job here
    return params, id

data_list = range(8)

for res in parallel_task(long_running_task, data_list, "a", 1, "b"):
    print(res)

Output:

('a', 1, 'b') 0
('a', 1, 'b') 1
('a', 1, 'b') 2
('a', 1, 'b') 3
('a', 1, 'b') 4
('a', 1, 'b') 5
('a', 1, 'b') 6
('a', 1, 'b') 7

Note: If you're using Anaconda and you want to see the progress of the heavy task, you can use print() inside long_running_task(); the output will be displayed in the Anaconda Prompt console.

answered Sep 18 '22 by H4dr1en