If I have a generator function in Python, say:
def gen(x):
    for i in range(x):
        yield i ** 2
How do I declare that the output data type is int in Cython? Is it even worthwhile?
Thanks.
Edit: I read mentions of (async) generators being implemented in the changelog: http://cython.readthedocs.io/en/latest/src/changes.html?highlight=generators#id23
However, there is no documentation about how to use them. Is that because they are supported but offer no particular advantage in Cython, or because no optimization is possible?
No, there is no way to do this in Cython.
When you look at the Cython-produced code, you will see that gen (and other generator functions) returns a generator, which is basically a __pyx_CoroutineObject object, which looks as follows:
typedef PyObject *(*__pyx_coroutine_body_t)(PyObject *, PyThreadState *, PyObject *);

typedef struct {
    PyObject_HEAD
    __pyx_coroutine_body_t body;
    PyObject *closure;
    ...
    int resume_label;
    char is_running;
} __pyx_CoroutineObject;
The most important part is the body member: this is the function which does the actual calculation. As we can see, it returns a PyObject * and there is no way (yet?) for it to be adapted to int, double or similar.
As for the reasons why it is not done, I can only speculate - but there are probably more than just one reason.
If you really care about performance, generators introduce too much overhead anyway (for example, yield is not possible in cdef functions) and should be refactored into something simpler.
To elaborate on possible refactorings: as a baseline, let's assume we would like to sum up all created values:
%%cython
def gen(int x):
    cdef int i
    for i in range(x):
        yield i ** 2

def sum_it(int n):
    cdef int i
    cdef int res = 0
    for i in gen(n):
        res += i
    return res
Timing it leads to:
>>> %timeit sum_it(1000)
28.9 µs ± 1.06 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
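For reference, the pure Python counterpart of this baseline can be timed outside IPython with the standard timeit module. This is only a sketch: the names gen_py and sum_it_py are made up for this illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit


def gen_py(x):
    # Same generator as above, but without any Cython type declarations.
    for i in range(x):
        yield i ** 2


def sum_it_py(n):
    # Sum all generated values; every value is a regular Python int object.
    res = 0
    for i in gen_py(n):
        res += i
    return res


if __name__ == "__main__":
    runs = 1000
    # timeit.timeit returns the total time in seconds for `runs` calls.
    total = timeit.timeit("sum_it_py(1000)", globals=globals(), number=runs)
    print(f"{total / runs * 1e6:.1f} us per call")
```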
The good news: it is about 10 times faster than the pure Python version. But if we are really after speed:
%%cython
cdef int gen_fast(int i):
    return i ** 2

def sum_it_fast(int n):
    cdef int i
    cdef int res = 0
    for i in range(n):
        res += gen_fast(i)
    return res
It is:
>>> %timeit sum_it_fast(1000)
661 ns ± 20.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
That is roughly 44 times faster than the generator-based sum_it (28.9 µs vs. 661 ns).
I understand that this is quite a change and might be pretty hard to do. I would do it only if it is really the bottleneck of my program, but then a speed-up of this magnitude would be a real motivation to do it.
Obviously there are a lot of other approaches: using numpy arrays or array.array instead of generators, or writing a custom generator (a cdef class) which would offer an additional fast/efficient way to get the int values instead of PyObjects. But this all depends on your scenario at hand. I just wanted to show that there is potential to improve the performance by ditching the generators.
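As a minimal sketch of the array.array alternative, in plain Python: the values are computed once and stored as C ints in a contiguous buffer instead of being yielded one by one. The helper name squares_array is hypothetical, and whether this fits depends on the use case (all values are materialized in memory up front).

```python
from array import array


def squares_array(n):
    # Type code "i" stores each value as a C signed int, so every value
    # must fit into that type (an OverflowError is raised otherwise).
    return array("i", [i ** 2 for i in range(n)])


values = squares_array(5)
print(values.tolist())  # [0, 1, 4, 9, 16]
print(sum(values))      # 30
```

In Cython, such an array can additionally be accessed through a typed memoryview (for example int[:]) so that the loop over its elements does not create Python objects; that part is not shown here.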