The Python community has published helpful reference material showing how to profile Python code, and the technical details of Python extensions in C or in Cython. I am still searching for tutorials which show, however, for non-trivial Python programs, the following:
A good tutorial would provide the reader with a methodology on how to reason through the problem of optimization by working through a complete example. I have had no success finding such a resource.
Do you know of (or have you written) such a tutorial?
For clarification, I'm not interested in tutorials that cover only the following:
Points 1 and 2 are just basic optimization rule of thumbs. I would be very astonished if there was anywhere the kind of tutorial you are looking for. Maybe that's why you haven't found one. My short list:
Just start by profiling your python code with usual python tools. Find where you code need to be optimized. Then try to optimize it sticking with python. If it is still too slow, try to understand why. If it's IO bound it is unlikely a C program would be better. If the problem come from the algorithm it is also unlikely C would perform better. Really the "good" cases where C could help are quite rare, runtime should not be too far away from what you want (like a 2 of 3 times speedup) data structure are simples and would benefit from a low level representation and you really, really need that speedup. In most other cases using C instead of python will be an unrewarding job.
Really it is quite rare calling C code from python is done with performance in mind as a primary goal. More often the goal is to interface python with some existing C code.
And as another other poster said, you would probably be better advised of using cython.
If you still want to write a C module for Python, all necessary is in the official documentation.
O'Reilly has a tutorial (freely available as far as I can tell, I was able to read the whole thing) that illustrates how to profile a real project (they use an EDI parsing project as a subject for profiling) and identify hotspots. There's not too much detail on writing the C extension that will fix the bottleneck in the O'Reilly article. It does, however, cover the first two things that you want with a non-trivial example.
The process of writing C extensions is fairly well documented here. The hard part is coming up with ways to replicate what Python code is doing in C, and that takes something that would be hard to teach in a tutorial: ingenuity, knowledge of algorithms, hardware, and efficiency, and considerable C skill.
Hope this helps.
For points 1 and 2, I would use a Python profiler, for example cProfile. See here for a quick tutorial.
If you've got an already existing python program, for point 3 you might want to consider using Cython. Of course, rather than re-writing in C, you may be able to think up an algorithmic improvement that will increase execution speed.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With