
assert self._state in (CLOSE, TERMINATE) when using python multiprocess

I am currently trying to use Python multiprocessing. The library I use is multiprocess (NOT multiprocessing).

I have the following code, which creates a number of computing jobs and runs them through a map operation:

pool = multiprocess.Pool(4)
all_responses = pool.map_async(wrapper_singlerun, range(10000))
pool.join()
pool.close()

However, whenever I run this snippet of code, I get the following error:

    pool.join()
  File "/Users/davidal/miniconda3/lib/python3.6/site-packages/multiprocess/pool.py", line 509, in join
    assert self._state in (CLOSE, TERMINATE)
AssertionError

Do you have any idea why this error happens? I used pool.map_async before, but figured that I needed a pool rendezvous command. Otherwise, my PC created something like a fork bomb, which created too many threads (at least, that's what I think it does...)

Any ideas are appreciated!

DaveTheAl asked May 29 '18 21:05



1 Answer

The problem is that you're calling join before close.

multiprocess appears to be missing its documentation, but, as far as I can tell, it's basically a fork of the stdlib multiprocessing that pre-monkeypatches dill in for pickle, so the multiprocessing docs should be relevant here. (Also, in a comment, you said that you can repro the problem with multiprocessing.)

So, Pool.join says:

Wait for the worker processes to exit. One must call close() or terminate() before using join().

The close method is how you shut down the send side of the queue so new tasks can't be added. The join method is how you wait for everything on the queue to be processed. Waiting for the queue to drain before closing it wouldn't work.

But you're calling close after join, instead of before. And the first thing join does is assert that you've already called close or terminate, which you haven't, hence the assertion failure.
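To see that check in isolation, here is a minimal sketch using the stdlib multiprocessing (assumed to behave like multiprocess here). The function name join_before_close is made up for illustration. On Python 3.6 the failure is the AssertionError from the traceback; newer CPython versions, as far as I know, raise ValueError instead, so the sketch catches both.

```python
import multiprocessing


def join_before_close():
    """Call join() on a pool that is still running; return the error's type name."""
    pool = multiprocessing.Pool(2)
    try:
        pool.join()  # pool was never closed or terminated, so this is rejected
    except (AssertionError, ValueError) as exc:
        return type(exc).__name__
    finally:
        # Clean up properly: terminate first, and only then is join() allowed.
        pool.terminate()
        pool.join()
    return None


if __name__ == "__main__":
    print(join_before_close())
```

Note that the cleanup in the finally block succeeds precisely because terminate is called before join.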

So, you probably just want to switch the order of those two calls.
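A sketch of the corrected order, again with the stdlib multiprocessing and with math.factorial standing in for your wrapper_singlerun (which I can't see); run_jobs and its parameters are names I made up for the example:

```python
import math
import multiprocessing


def run_jobs(n_jobs, n_workers=4):
    """Submit jobs with map_async, then close() before join()."""
    pool = multiprocessing.Pool(n_workers)
    # math.factorial is only a placeholder for the real worker function.
    all_responses = pool.map_async(math.factorial, range(n_jobs))
    pool.close()  # no more tasks may be submitted
    pool.join()   # wait for the workers to finish and exit
    return all_responses.get()  # results are already available at this point


if __name__ == "__main__":
    print(run_jobs(6))  # [1, 1, 2, 6, 24, 120]
```

With multiprocess, swapping the import should be the only change needed, assuming it really does mirror the stdlib API.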

Or, alternatively, maybe you were confused about what join is for, and thought you needed to call it before you could use all_responses.get() or .wait(). If so, you don't need to do that: get will block until the results are available, after which you don't need a join. This is actually the more common pattern, especially with map and friends (although the examples in the docs do it via a with Pool(…) as pool: instead of manually calling anything on the pool).
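That alternative might look roughly like this. It is only a sketch: math.factorial is again a placeholder for wrapper_singlerun, and run_jobs_with_context is a hypothetical name. The one thing to watch is that get() has to be called inside the with block, because leaving the block terminates the pool.

```python
import math
import multiprocessing


def run_jobs_with_context(n_jobs, n_workers=4):
    """Let the context manager own the pool; no explicit close() or join()."""
    with multiprocessing.Pool(n_workers) as pool:
        async_result = pool.map_async(math.factorial, range(n_jobs))
        # get() blocks until every result is ready, so it must run
        # before the with block exits and tears the pool down.
        return async_result.get()


if __name__ == "__main__":
    print(run_jobs_with_context(5))  # [1, 1, 2, 6, 24]
```

If you don't need the async handle at all, pool.map(...) inside the same with block does the blocking for you.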

abarnert answered Sep 26 '22 02:09