Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Line Profiling inner function with Cython

I've had pretty good success using this answer to profile my Cython code, but it doesn't seem to work properly with nested functions. In this notebook you can see that the profile doesn't appear when the line profiler is used on a nested function. Is there a way to get this to work?

like image 985
C_Z_ Avatar asked Jun 28 '16 21:06

C_Z_


1 Answers

tl,dr:

This is seems to be an issue with Cython, there's a hackish way that does the trick but isn't reliable, you could use it for one-off cases until this issue has been fixed*

Change the line_profiler source:

I can't be 100% sure for this but it is working, what you need to do is download the source for line_profiler and go fiddle around in python_trace_callback. After the code object is obtained from the current frame of execution (code = <object>py_frame.f_code), add the following:

if what == PyTrace_LINE or what == PyTrace_RETURN:
    code = <object>py_frame.f_code

    # Add entry for code object with different address if and only if it doesn't already
    # exist **but** the name of the function is in the code_map
    if code not in self.code_map and code.co_name in {co.co_name for co in self.code_map}:
        for co in self.code_map:
            # make condition as strict as necessary
            cond = co.co_name == code.co_name and co.co_code == code.co_code
            if cond:
                del self.code_map[co]
                self.code_map[code] = {}

This will replace the code object in self.code_map with the one currently executing that matches its name and co.co_code contents. co.co_code is b'' for Cython, so in essence in matches Cython functions with that name. Here is where it can become more robust and match more attributes of a code object (for example, the filename).

You can then procceed to build it with python setup.py build_ext and install with sudo python setup.py install. I'm currently building it with python setup.py build_ext --inplace in order to work with it locally, I'd suggest you do too. If you do build it with --inplace make sure you navigate to the folder containing the source for line_profiler before importing it.

So, in the folder containing the built shared library for line_profiler I set up a cyclosure.pyx file containing your functions:

def outer_func(int n):
    def inner_func(int c):
        cdef int i
        for i in range(n):
             c+=i
        return c
    return inner_func

And an equivalent setup_cyclosure.py script in order to build it:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
from Cython.Compiler.Options import directive_defaults

directive_defaults['binding'] = True
directive_defaults['linetrace'] = True

extensions = [Extension("cyclosure", ["cyclosure.pyx"], define_macros=[('CYTHON_TRACE', '1')])]
setup(name = 'Testing', ext_modules = cythonize(extensions))

As previously, the build was performed with python setup_cyclosure.py build_ext --inplace.

Launching your interpreter from the current folder and issuing the following yields the wanted results:

>>> import line_profiler
>>> from cyclosure import outer_func
>>> f = outer_func(5)
>>> prof = line_profiler.LineProfiler(f)
>>> prof.runcall(f, 5)

15
>>> prof.print_stats()
Timer unit: 1e-06 s

Total time: 1.2e-05 s
File: cyclosure.pyx

Function: inner_func at line 2

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                               def inner_func(int c):
     3                                                   cdef int i
     4         1            5      5.0     41.7          for i in range(n):
     5         5            6      1.2     50.0               c+=i
     6         1            1      1.0      8.3          return c  

Issue with IPython %%cython:

Trying to run this from IPython results in an unfortunate situation. While executing, the code object doesn't store the path to the file where it was defined, it simply stored the filename. Since I simply drop the code object into the self.code_map dictionary and since code objects have read-only Attributes, we lose the file path information when using it from IPython (because it stores the files generated from %%cython in a temporary directory).

Because of that, you do get the profiling statistics for your code but you get no contents for the contents. One might be able to forcefully copy the filenames between the two code objects in question but that's another issue altogether.

*The Issue:

The issue here is that for some reason, when dealing with nested and/or enclosed functions, there's an abnormality with the address of the code object when it is created and while it is being interpreted in one of Pythons frames. The issue you were facing was caused by the following condition not being satisfied:

    if code in self.code_map:

Which was odd. Creating your function in IPython and adding it to the LineProfiler did indeed add it to the self.code_map dictionary:

prof = line_profiler.LineProfiler(f)

prof.code_map
Out[16]: {<code object inner_func at 0x7f5c65418f60, file "/home/jim/.cache/ipython/cython/_cython_magic_1b89b9cdda195f485ebb96a104617e9c.pyx", line 2>: {}}

When the time came to actually test the previous condition though, and the current code object was snatched from the current execution frame with code = <object>py_frame.f_code, the address of the code object was different:

 # this was obtained with a basic print(code) in _line_profiler.pyx
 code object inner_func at 0x7f7a54e26150

indicating it was re-created. This only happens with Cython and when a function is defined inside another function. Either this or something that I am completely missing.

like image 150
Dimitris Fasarakis Hilliard Avatar answered Oct 17 '22 16:10

Dimitris Fasarakis Hilliard