
Profiling Python code that uses multiprocessing?

I have a simple producer-consumer pattern set up in part of my GUI code. I'm trying to profile just the consumer section to see whether it can be optimized. However, when I run the code with python -m cProfile -o out.txt myscript.py, an error is thrown from Python's pickle module.

  File "<string>", line 1, in <module>
  File "c:\python27\lib\multiprocessing\forking.py", line 374, in main
    self = load(from_parent)
  File "c:\python27\lib\pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "c:\python27\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "c:\python27\lib\pickle.py", line 880, in load_eof
    raise EOFError
EOFError

The basic pattern in the code is

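import multiprocessing

class MyProcess(multiprocessing.Process):
    def __init__(self, in_queue, msg_queue):
        multiprocessing.Process.__init__(self)
        self.in_queue = in_queue
        self.ext_msg_queue = msg_queue
        # Assignment, not comparison; __init__ runs in the parent process,
        # so this is the name of the process that creates the worker.
        self.name = multiprocessing.current_process().name

    def run(self):
        # Do stuff with the queued items
        pass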

This is usually fed tasks from the GUI side of things, but for testing purposes, I set it up as follows.

import time

if __name__ == '__main__':

    queue = multiprocessing.Queue()
    msg_queue = multiprocessing.Queue()
    # MyProcess takes both queues, matching the class shown above.
    p = MyProcess(queue, msg_queue)
    p.daemon = True
    p.start()
    time.sleep(20)
    p.join()

But upon trying to start the script, I get the above error message.

Is there a way around the error?

Asked by Zack Yoshyaro on Aug 09 '13 at 15:08



1 Answer

In my understanding the cProfile module doesn't play well with multiprocessing on the command line. (See Python multiprocess profiling.)

One way to work around this is to run the profiler from within your code -- in particular, you'll need to set up a different profiler output file for each process. You can do this pretty easily by calling cProfile.runctx('a+b', globals(), locals(), 'profile-%s.out' % process_name), where 'a+b' stands in for the statement you want to profile and process_name is the name of the current process.

http://docs.python.org/2/library/profile.html#module-cProfile
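As a rough sketch of that approach, the consumer can call cProfile.runctx from inside its own run() method, so the profiling happens in the child process and each worker writes its own file. The class name ProfiledWorker, the consume() method, the None sentinel, and the dummy workload are all hypothetical stand-ins for illustration, not part of the original code.

```python
import cProfile
import multiprocessing
import os
import pstats


class ProfiledWorker(multiprocessing.Process):
    """Hypothetical consumer that profiles its own work loop."""

    def __init__(self, in_queue):
        multiprocessing.Process.__init__(self)
        self.in_queue = in_queue

    def run(self):
        # run() executes in the child process, so the profiler is started
        # there and each worker gets its own output file.
        out_file = 'profile-%s.out' % self.name
        cProfile.runctx('self.consume()', globals(), locals(), out_file)

    def consume(self):
        # Placeholder workload: process items until a None sentinel arrives.
        for item in iter(self.in_queue.get, None):
            sum(i * i for i in range(item))


def main():
    queue = multiprocessing.Queue()
    worker = ProfiledWorker(queue)
    worker.start()
    for n in (1000, 2000, 3000):
        queue.put(n)
    queue.put(None)  # sentinel: tell the worker to stop
    worker.join()

    # Load the worker's profile and show the five most expensive entries.
    stats = pstats.Stats('profile-%s.out' % worker.name)
    stats.sort_stats('cumulative').print_stats(5)


if __name__ == '__main__':
    main()
```

With this layout the script is started normally (python myscript.py) rather than through python -m cProfile, and the resulting profile-*.out files can be inspected afterwards with the pstats module.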

Answered by lmjohns3 on Sep 17 '22 at 22:09