I'm starting to learn Cython because of performance issues. This particular code is an attempt to implement some new algorithms in the transportation modeling (for planning) area.
I decided to start with a very simple function that I will use a LOT (hundreds of millions of times) and would definitely benefit from a performance increase.
I implemented this function in three different ways and timed each one over 10 million calls with the same arguments (for the sake of simplicity):
Cython code in a cython module. Running time: 3.35s
Python code in a Cython module. Running time: 4.88s
Python code on the main script. Running time: 2.98s
As you can see, the Cython code is still about 12% slower than the Python code in the main script, and the plain Python code in the Cython module is the slowest of all, 64% slower than the main-script version. How is that possible? Where am I making a mistake?
The cython code is this:
def BPR2(vol, cap, al, be):
    con = al * pow(vol / cap, be)
    return con

def func(float volume, float capacity, float alfa, float beta):
    cdef float congest
    congest = alfa * pow(volume / capacity, beta)
    return congest
And the script for testing is this:
from time import clock
import linkdelay

agora = clock()
for i in range(10000000):
    q = linkdelay.BPR2(10, 5, 0.15, 4)
agora = clock() - agora
print agora

agora = clock()
for i in range(10000000):
    q = linkdelay.func(10, 5, 0.15, 4)
agora = clock() - agora
print agora

agora = clock()
for i in range(10000000):
    q = 0.15 * pow(10 / 5, 4)
agora = clock() - agora
print agora
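Incidentally, timings like these are easier to get right with the standard timeit module, which picks the best available clock and runs the statement in a tight loop with garbage collection disabled. A minimal sketch, using a pure-Python stand-in for BPR2 since the compiled module is not available here:

```python
import timeit

def BPR2(vol, cap, al, be):
    # Pure-Python stand-in for linkdelay.BPR2, for illustration only.
    return al * pow(vol / cap, be)

# Time 1 million calls of the function under test.
elapsed = timeit.timeit(lambda: BPR2(10, 5, 0.15, 4), number=1000000)
print(elapsed)
```

The lambda adds a small constant overhead per call, but it is the same for every variant being compared, so the relative numbers stay meaningful.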
I'm aware of issues like transcendental functions (pow) being slower, but I don't think that is the problem here.
Since there is overhead in looking up the function in the module's namespace, would it help performance if I passed an array into the function and got an array back? Can I return an array from a function written in Cython?
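On the array idea: for a formula this simple, one way to sidestep per-call overhead entirely is NumPy, which evaluates the expression element-wise in compiled loops, no Cython required. A sketch, with hypothetical function and argument names:

```python
import numpy as np

def bpr_array(volumes, capacities, alfa=0.15, beta=4.0):
    # One vectorized expression replaces millions of Python-level calls.
    v = np.asarray(volumes, dtype=np.float64)
    c = np.asarray(capacities, dtype=np.float64)
    return alfa * (v / c) ** beta

print(bpr_array([10.0, 20.0], [5.0, 10.0]))  # both ratios are 2, so both results are 2.4
```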
For reference, I'm using:
Calling a Cython function is faster than calling a Python function, it's true. But even 30 nanoseconds per call is rather slow by the standards of compiled languages: for comparison, a C function called by another C function might take only 3 nanoseconds, or much less if it gets inlined.
In summary: Python code is slowed down by the bytecode compilation and interpretation that happen at runtime. Compare this to a statically typed, compiled language, which runs just the CPU instructions once compiled. It is, however, possible to extend Python with compiled modules written in C.
Cython allows native C functions, which have less overhead than Python functions when they are called, and therefore execute faster.
To give a sense of the scale involved: summing 1 billion numbers in a plain Python loop takes more than 500 seconds, while the equivalent typed Cython loop takes around 1 second. That makes Cython roughly 500x faster than Python for that task.
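For reference, the pure-Python side of that comparison is just a summing loop; the Cython version is essentially the same code with cdef declarations for the loop variables, which is what lets it compile down to a bare C loop. Shown here with a smaller count so it finishes quickly:

```python
def total(n):
    # In Cython, declaring "cdef long long i, s" here turns
    # this into a plain C loop with no Python objects.
    s = 0
    for i in range(n):
        s += i
    return s

print(total(10000000))  # sum of 0..9999999 -> 49999995000000
```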
Testing was done using:

for i in range(10000000):
    func(2.7, 2.3, 2.4, i)

Here are the results:

cdef float func(float v, float c, float a, float b):
    return a * (v / c) ** b
#=> 0.85

cpdef float func(float v, float c, float a, float b):
    return a * (v / c) ** b
#=> 0.84

def func(v, c, a, b):
    return a * pow(v / c, b)
#=> 3.41

cdef float func(float v, float c, float a, float b):
    return a * pow(v / c, b)
#=> 2.35
For the highest efficiency you need to define the function as a C function with cdef (or cpdef, if it also needs to be callable from Python) and give it a static return type.
This function could be optimized as such (in both Python and Cython, removing the intermediate variable is faster):

def func(float volume, float capacity, float alfa, float beta):
    return alfa * pow(volume / capacity, beta)
When Cython is slower, it's probably due to type conversions, possibly exacerbated by a lack of type annotations. Also, using C data structures in Cython tends to be faster than using Python data structures in Cython.
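As an illustration of that last point, and of the array-in/array-out approach raised in the question: with typed memoryviews a Cython function can take NumPy arrays, loop over them at C speed, and return an array. This is only a sketch of a .pyx module that would need to be compiled before importing; the names are illustrative:

# bpr_arrays.pyx, a sketch; build with cythonize before importing
import numpy as np

def bpr_all(double[:] vol, double[:] cap, double alfa, double beta):
    # Typed memoryviews index the underlying buffers directly,
    # so the loop body involves no Python objects.
    cdef Py_ssize_t i, n = vol.shape[0]
    out = np.empty(n)
    cdef double[:] res = out
    for i in range(n):
        res[i] = alfa * (vol[i] / cap[i]) ** beta
    return out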
I did a performance comparison between CPython 2.x (with and without Cython, with and without psyco), CPython 3.x (with and without Cython), Pypy, and Jython. Pypy was by far the fastest, at least for the micro-benchmark examined: http://stromberg.dnsalias.org/~strombrg/backshift/documentation/performance/