I have a simple producer/consumer pattern set up in part of my GUI code. I'm attempting to profile just the consumer section to see if there's any room for optimization. However, when I run the code with python -m cProfile -o out.txt myscript.py, I get an error thrown from Python's pickle module.
  File "<string>", line 1, in <module>
  File "c:\python27\lib\multiprocessing\forking.py", line 374, in main
    self = load(from_parent)
  File "c:\python27\lib\pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "c:\python27\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "c:\python27\lib\pickle.py", line 880, in load_eof
    raise EOFError
EOFError
The basic pattern in the code is
    import multiprocessing

    class MyProcess(multiprocessing.Process):
        def __init__(self, in_queue, msg_queue):
            multiprocessing.Process.__init__(self)
            self.in_queue = in_queue
            self.ext_msg_queue = msg_queue
            # NOTE: `==` only compares; the result is discarded and name is unchanged.
            self.name == multiprocessing.current_process().name

        def run(self):
            ## Do Stuff with the queued items
            pass
This is usually fed tasks from the GUI side of things, but for testing purposes, I set it up as follows.
    import time

    if __name__ == '__main__':
        queue = multiprocessing.Queue()
        msg_queue = multiprocessing.Queue()
        p = MyProcess(queue, msg_queue)
        p.daemon = True
        p.start()
        time.sleep(20)
        p.join()
But upon trying to start the script, I get the above error message.
Is there a way around the error?
Python multiprocessing Process class: First, write a function for the process to run. Then instantiate a Process object with that function as its target. Nothing happens until you call start(); the process then runs the function. A Process does not hand back a return value, so results have to be passed out another way, for example through a Queue.
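The sequence described above can be sketched roughly as follows. This is a minimal illustration written for Python 3, not code from the original post: the names `square_into` and `run_once` are hypothetical, and passing the result back through a queue is just one possible choice.

```python
import multiprocessing


def square_into(result_queue, n):
    # Hypothetical worker: runs in the child process and reports its
    # result through the queue, because run() itself returns nothing.
    result_queue.put(n * n)


def run_once(n):
    # Instantiate the process, start it, collect the result, then join.
    result_queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=square_into, args=(result_queue, n))
    p.start()                     # nothing runs until start() is called
    value = result_queue.get()    # blocks until the child has put a value
    p.join()                      # wait for the child to exit
    return value


if __name__ == '__main__':
    print(run_once(7))
```

The `if __name__ == '__main__':` guard matters on Windows, where the child process re-imports the main module instead of forking.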
Python's Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, so CPU-bound Python code gains little from multithreading. Separate processes each have their own interpreter and their own GIL, which is what gives multiprocessing the upper hand over threading for that kind of work.
Profiling is a technique to figure out how time is spent in a program. With these statistics, we can find the “hot spot” of a program and think about ways of improvement. Sometimes, a hot spot in an unexpected location may hint at a bug in the program as well.
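As a rough illustration of finding a hot spot, the sketch below profiles a single call in-process with cProfile and prints the most expensive functions with pstats. It is written for Python 3, and `slow_sum`, `work`, and `profile_report` are hypothetical names invented for this example.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Hypothetical hot spot: a deliberately naive loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def work():
    # The call we want to profile; most of its time goes to slow_sum.
    return slow_sum(200000) + sum(range(1000))


def profile_report(func, limit=5):
    # Profile one call of func and return (result, text report).
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats('cumulative').print_stats(limit)
    return result, stream.getvalue()


if __name__ == '__main__':
    _, report = profile_report(work)
    print(report)
```

Sorting by cumulative time puts the functions that account for the most total time, including their callees, at the top of the report.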
In my understanding, the cProfile module doesn't play well with multiprocessing on the command line. (See Python multiprocess profiling.)
One way to work around this is to run the profiler from within your code. In particular, you'll need to set up a different profiler output file for each process in your pool. You can do this pretty easily by making a call to cProfile.runctx('a+b', globals(), locals(), 'profile-%s.out' % process_name), with 'a+b' replaced by the statement you want to profile.
http://docs.python.org/2/library/profile.html#module-cProfile
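Applied to a Process subclass like the one in the question, the idea might look like the sketch below. It is an illustration under assumptions, not the asker's actual code: `consume`, `ProfiledProcess`, the `out_dir` parameter, and the `None` sentinel are all hypothetical, and the example is written for Python 3. The script is then started normally (python myscript.py) rather than under python -m cProfile.

```python
import cProfile
import multiprocessing
import os
import pstats


def consume(in_queue):
    # Hypothetical consumer loop: handle items until a None sentinel arrives.
    handled = 0
    while True:
        item = in_queue.get()
        if item is None:
            break
        handled += item * item
    return handled


class ProfiledProcess(multiprocessing.Process):
    def __init__(self, in_queue, out_dir='.'):
        multiprocessing.Process.__init__(self)
        self.in_queue = in_queue
        self.out_dir = out_dir

    def run(self):
        # Runs in the child: profile only the consumer and write one
        # stats file per process, named after the process.
        path = os.path.join(self.out_dir, 'profile-%s.out' % self.name)
        cProfile.runctx('consume(q)', globals(), {'q': self.in_queue}, path)


if __name__ == '__main__':
    in_queue = multiprocessing.Queue()
    p = ProfiledProcess(in_queue)
    p.start()
    for i in range(1000):
        in_queue.put(i)
    in_queue.put(None)            # tell the consumer to stop
    p.join()
    # Load the child's stats file in the parent and show the top entries.
    stats = pstats.Stats('profile-%s.out' % p.name)
    stats.sort_stats('cumulative').print_stats(5)
```

Because the profiler starts inside run(), it executes in the child process itself, so only the consumer's work is measured and each process writes its own output file.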