I am running a piece of code using a multiprocessing pool. The code works on one data set and fails on another. Clearly the issue is data driven - having said that, I am not clear where to begin troubleshooting, as the error I receive is the one below. Any hints for a starting point would be most helpful. Both sets of data are prepared using the same code, so I don't expect there to be a difference, yet here I am.
Also see the comment from Robert - we differ on OS and Python version (I have 3.4, he has 3.6) and use quite different data sets, yet the error is identical down to the line numbers in the Python code.
My suspicions:
there is some period of time after which the pool tries to collect results, finds a process is not finished, and gives up.
Exception in thread Thread-9:
Traceback (most recent call last):
  File "C:\Program Files\Python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\threading.py", line 911, in _bootstrap_inner
    self.run()
  File "C:\Program Files\Python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\threading.py", line 859, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Program Files\Python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\multiprocessing\pool.py", line 429, in _handle_results
    task = get()
  File "C:\Program Files\Python\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\multiprocessing\connection.py", line 251, in recv
    return ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() missing 1 required positional argument: 'message'
I think the issue is that langdetect quietly declares a hidden global detector factory here https://github.com/Mimino666/langdetect/blob/master/langdetect/detector_factory.py#L120:
def init_factory():
    global _factory
    if _factory is None:
        _factory = DetectorFactory()
        _factory.load_profile(PROFILES_DIRECTORY)

def detect(text):
    init_factory()
    detector = _factory.create()
    detector.append(text)
    return detector.detect()

def detect_langs(text):
    init_factory()
    detector = _factory.create()
    detector.append(text)
    return detector.get_probabilities()
This kind of thing can cause issues under multiprocessing, in my experience, by running afoul of the way multiprocessing shares resources in memory and manages namespaces across the worker and master processes, though the exact mechanism in this case is a black box to me. I fixed it by adding a call to init_factory to my pool initialization function:
import signal

import requests
from requests.adapters import HTTPAdapter
from langdetect.detector_factory import init_factory

def worker_init_corpus(stops_in):
    global sess
    global stops
    sess = requests.Session()
    sess.mount("http://", HTTPAdapter(max_retries=10))
    stops = stops_in
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    init_factory()
FYI: The "sess" logic gives each worker its own HTTP connection pool for requests; I hit a similar problem using that module with multiprocessing pools. If you don't do this, the workers route all their HTTP communication back up through the parent process, because that is where the hidden global HTTP connection pool lives by default, and everything becomes painfully slow. This is one of the issues I've run into that made me suspect a similar cause here.
Also, to further reduce potential confusion: stops provides the stopword list I'm using to the mapped function, and the signal call forces pool workers to exit cleanly when hit with a user interrupt (Ctrl-C). Otherwise they often get orphaned and just keep chugging along after the parent process dies.
Then my pool is initialized like this:
self.pool = mp.Pool(mp.cpu_count()-2, worker_init_corpus, (self.stops,))
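A simplified, runnable sketch of the initializer pattern above (the stopword set and strip_stops function here are made-up stand-ins): each worker receives its shared state once via the pool initializer, instead of having it pickled along with every task.

```python
import multiprocessing as mp

stops = None  # per-worker global, populated by the initializer

def worker_init(stops_in):
    global stops
    stops = stops_in

def strip_stops(text):
    # Uses the per-worker global set up by worker_init.
    return " ".join(w for w in text.split() if w not in stops)

if __name__ == "__main__":
    pool = mp.Pool(2, worker_init, ({"the", "a"},))
    print(pool.map(strip_stops, ["the quick fox", "a lazy dog"]))  # ['quick fox', 'lazy dog']
    pool.close()
    pool.join()
```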
I also wrapped my call to detect in a try/except LangDetectException block:
from langdetect.lang_detect_exception import LangDetectException

try:
    posting_out["lang"] = detect(posting_out["job_description"])
except LangDetectException:
    posting_out["lang"] = "none"
But this doesn't fix it on its own. I'm pretty confident that the initialization is the fix.
Thanks to Robert - focusing on langdetect surfaced the fact that one of my text entries was possibly empty:
LangDetectException: No features in text
Rookie mistake, possibly due to encoding errors. Re-running after filtering those out - will keep you (Robert) posted.
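The filter mentioned above can be as simple as dropping entries whose text is empty or whitespace-only before calling detect, since that is what triggers "No features in text" (the job_description field name is taken from the answer above; the sample data is made up):

```python
postings = [
    {"job_description": "Senior data engineer wanted"},
    {"job_description": ""},     # would raise "No features in text"
    {"job_description": "   "},  # whitespace-only: same problem
]

# Keep only postings with non-empty text before handing them to detect().
cleaned = [p for p in postings if p["job_description"].strip()]
print(len(cleaned))  # 1
```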
I was throwing a custom exception somewhere in my code, and it was being raised in most of my pool processes. About 90% of my processes went to sleep because this exception occurred in them, but instead of a normal traceback I got this cryptic error. Mine was on Linux, though.
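A minimal sketch of why a custom exception turns into this cryptic TypeError: the pool pickles an exception raised in a worker and reconstructs it in the parent by calling its class with self.args, so a subclass whose __init__ takes arguments it doesn't forward to the base class fails to unpickle inside _handle_results. The LangError class here is a hypothetical stand-in for such an exception:

```python
import pickle

# Hypothetical exception whose __init__ takes (code, message) but only
# forwards message to the base class, so self.args == (message,).
class LangError(Exception):
    def __init__(self, code, message):
        super().__init__(message)
        self.code = code

err = LangError(42, "no features in text")
data = pickle.dumps(err)  # pickling succeeds...

try:
    pickle.loads(data)  # ...but reconstruction calls LangError(*err.args)
except TypeError as e:
    print(e)  # __init__() missing 1 required positional argument: 'message'
```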
To debug this, I removed the pool and ran the code sequentially.
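A sketch of that debugging trick: guard the pool behind a flag and run the same function through the builtin map when debugging, so exceptions surface with a full traceback instead of dying in the pool's result-handler thread (the process function and data here are hypothetical):

```python
import multiprocessing as mp

DEBUG = True

def process(n):
    if n < 0:
        raise ValueError("negative input")  # full traceback in DEBUG mode
    return n * n

items = [1, 2, 3]

if __name__ == "__main__":
    if DEBUG:
        results = list(map(process, items))    # sequential: easy to trace
    else:
        with mp.Pool(2) as pool:
            results = pool.map(process, items)  # parallel: exceptions get pickled
    print(results)  # [1, 4, 9]
```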