joblib: Worker stopped caused by timeout or memory leak

I am only using the basic joblib functionality:

Parallel(n_jobs=-1)(delayed(function)(arg) for arg in arglist)

I am frequently getting the warning:

UserWarning: A worker stopped while some jobs were given to the executor. This can be caused by a too short worker timeout or by a memory leak.

This tells me that one possible cause is a worker timeout that is too short. Since I did not set a worker timeout and the default is None, this cannot be the issue. How do I go about finding a memory leak? Is there something I can do to avoid this warning? Did some jobs not get executed? Or should I just not worry about it?

asked May 25 '20 12:05 by cmosig



1 Answer

To fix this, increase the timeout. I used the following:

import joblib

# Increase timeout (tune this number to suit your use case).
timeout = 99999
result_chunks = joblib.Parallel(n_jobs=njobs, timeout=timeout)(joblib.delayed(f_chunk)(i) for i in n_chunks)

Note that this warning is benign; joblib will recover and results are complete and accurate.
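
If you would rather confirm that nothing was skipped than take it on trust, one option is to record warnings around the parallel call and compare the output with a plain serial run on a small sample. The following is only a minimal sketch: `run_and_check`, `square` and `fake_parallel_run` are hypothetical names made up for illustration, and `fake_parallel_run` is a stand-in that raises the same kind of UserWarning so the snippet runs without joblib. In real use you would pass something like `lambda: Parallel(n_jobs=-1)(delayed(function)(arg) for arg in arglist)` as `run_jobs`.

```python
import warnings


def run_and_check(run_jobs, function, arglist):
    """Run run_jobs() while recording warnings, then compare with a serial run."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeated ones
        results = run_jobs()
    # Serial reference run: doubles the cost, so use a small sample of arglist.
    expected = [function(arg) for arg in arglist]
    worker_warnings = [
        str(w.message)
        for w in caught
        if issubclass(w.category, UserWarning) and "A worker stopped" in str(w.message)
    ]
    return results == expected, worker_warnings


def square(x):
    # Hypothetical job function used only for this demonstration.
    return x * x


arglist = list(range(10))


def fake_parallel_run():
    # Stand-in for the joblib call (assumption: not the real executor).
    # It emits a warning with the same opening text and returns the results.
    warnings.warn(
        "A worker stopped while some jobs were given to the executor.",
        UserWarning,
    )
    return [square(arg) for arg in arglist]


complete, worker_warnings = run_and_check(fake_parallel_run, square, arglist)
print("results complete:", complete)
print("worker warnings seen:", len(worker_warnings))
```

If `complete` is True even though the warning was recorded, that is consistent with the warning being harmless for that run.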

See a more detailed answer here.
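
As for the memory leak part of the question, a rough first check that does not involve joblib at all is to call the function repeatedly in a single process and see whether traced memory keeps growing. This is a sketch using the standard library's `tracemalloc`. `memory_growth`, `leaky` and `clean` are hypothetical names for illustration, and `tracemalloc` only sees allocations made through Python's allocators, so a leak inside a C extension may not show up here.

```python
import tracemalloc


def memory_growth(function, arg, repeats=50):
    """Return the net bytes still allocated after calling function(arg) repeatedly."""
    tracemalloc.start()
    function(arg)  # warm-up call so one-time allocations are not counted
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(repeats):
        function(arg)
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


_cache = []


def leaky(n):
    # Hypothetical leaking function: every buffer stays alive in a module-level list.
    _cache.append(bytearray(n))
    return n


def clean(n):
    # Hypothetical non-leaking function: its buffer is released on return.
    return len(bytearray(n))


print("leaky growth (bytes):", memory_growth(leaky, 100_000))
print("clean growth (bytes):", memory_growth(clean, 100_000))
```

Growth that scales with `repeats` suggests the function is holding on to memory between calls. Growth near zero suggests it is not, at least for Python-level allocations.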

answered Sep 20 '22 02:09 by Contango