When writing a Cython wrapper for a C++ library, I've encountered a case where it's not clear how to correctly decide when to delete certain C++ instances.
The C++ library looks something like this:
#include <stdio.h>
#include <string.h>

class Widget {
    char *name;
public:
    Widget() : name(strdup("a widget")) {}
    ~Widget() { printf("Widget destruct\n"); }
    void foo() { printf("Widget::foo %s\n", this->name); }
};

class Sprocket {
private:
    Widget *important;
public:
    Sprocket(Widget* important) : important(important) {}
    ~Sprocket() { important->foo(); }
};
An important aspect of this library is that the Sprocket destructor uses the Widget* it was given, so the Widget must not be destroyed until after the Sprocket has been.
The Cython wrapper I've written looks like this:
cdef extern from "somelib.h":
    cdef cppclass Widget:
        pass
    cdef cppclass Sprocket:
        Sprocket(Widget*)

cdef class PyWidget:
    cdef Widget *thisptr
    def __init__(self):
        self.thisptr = new Widget()
    def __dealloc__(self):
        print 'PyWidget dealloc'
        del self.thisptr

cdef class PySprocket:
    cdef PyWidget widget
    cdef Sprocket *thisptr
    def __init__(self, PyWidget widget):
        self.widget = widget
        self.thisptr = new Sprocket(self.widget.thisptr)
    def __dealloc__(self):
        print 'PySprocket dealloc with widget', self.widget
        del self.thisptr
After building the Python extension like this:
$ cython --cplus somelib.pyx
$ g++ -I/usr/include/python2.6 -L/usr/lib somelib.cpp -shared -o somelib.so
$
In the trivial case, it appears to work:
$ python -c 'from somelib import PyWidget, PySprocket
spr = PySprocket(PyWidget())
del spr
'
PySprocket dealloc with widget <somelib.PyWidget object at 0xb7537080>
Widget::foo a widget
PyWidget dealloc
Widget destruct
$
The cdef PyWidget field keeps the PyWidget alive until after PySprocket.__dealloc__ destroys the Sprocket. However, as soon as the Python garbage collector gets involved, the tp_clear function Cython constructs for PySprocket messes this up:
$ python -c 'from somelib import PyWidget, PySprocket
class BadWidget(PyWidget):
    pass
widget = BadWidget()
sprocket = PySprocket(widget)
widget.cycle = sprocket
del widget
del sprocket
'
PyWidget dealloc
Widget destruct
PySprocket dealloc with widget None
Widget::foo ��h�
Since there's a reference cycle, the garbage collector invokes tp_clear to try to break the cycle. Cython's tp_clear drops all references to Python objects. Only after this happens does PySprocket.__dealloc__ get to run.
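To make the role of the cycle collector concrete, here is a minimal pure-Python sketch (not the Cython wrapper itself). It assumes CPython, and the Node class and make_cycle helper are hypothetical names used only for illustration. It shows that objects in a reference cycle survive the loss of their last outside reference and are only reclaimed when the collector runs, which is the point at which tp_clear would be invoked on an extension type.

```python
import gc
import weakref


class Node(object):
    """Hypothetical stand-in for a wrapper object that can hold references."""
    pass


def make_cycle():
    # Build two objects that refer to each other, then drop all outside
    # references by returning only weak references to them.
    a = Node()
    b = Node()
    a.other = b
    b.other = a
    return weakref.ref(a), weakref.ref(b)


gc.disable()  # keep the collector from running at an unpredictable moment
ref_a, ref_b = make_cycle()

# Reference counting alone cannot free the pair: each keeps the other alive.
alive_before = ref_a() is not None and ref_b() is not None

gc.collect()  # the cycle collector finds the unreachable pair and breaks it

alive_after = ref_a() is not None or ref_b() is not None
gc.enable()
```

Under these assumptions, alive_before is true and alive_after is false: the pair is only torn down by the collector, not by dropping the names.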
The Cython documentation warns about __dealloc__ (although it took me a while to learn what conditions it was talking about, since it doesn't go into any detail). So perhaps this approach is entirely invalid.
Can Cython support this use case?
As (what I hope is) a temporary work-around, I've moved to an approach that looks something like this:
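from cpython.ref cimport Py_INCREF, Py_DECREF

cdef class PySprocket:
    cdef void *widget
    cdef Sprocket *thisptr
    def __init__(self, PyWidget widget):
        # Take a strong reference that Cython's tp_clear does not know about
        Py_INCREF(widget)
        self.widget = <void*>widget
        # Use the typed argument here: self.widget is now an untyped void*
        self.thisptr = new Sprocket(widget.thisptr)
    def __dealloc__(self):
        del self.thisptr
        Py_DECREF(<object>self.widget)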
In other words, hiding the reference from Cython so that it is still valid in __dealloc__, and doing reference counting on it manually.
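The effect of a manually held reference can be sketched from plain Python as well. This is only an illustration, assuming CPython, where ctypes.pythonapi exposes the Py_IncRef and Py_DecRef functions; the Widget class here is a hypothetical pure-Python stand-in, not the wrapped C++ class. The extra reference keeps the object alive after the last visible name is deleted, until the matching decrement.

```python
import ctypes
import weakref

# CPython-specific: function forms of the Py_INCREF / Py_DECREF macros.
py_incref = ctypes.pythonapi.Py_IncRef
py_incref.argtypes = [ctypes.py_object]
py_incref.restype = None
py_decref = ctypes.pythonapi.Py_DecRef
py_decref.argtypes = [ctypes.py_object]
py_decref.restype = None


class Widget(object):
    """Hypothetical stand-in for a wrapper object."""
    pass


w = Widget()
ref = weakref.ref(w)

py_incref(w)  # hidden strong reference, invisible to normal bookkeeping
del w         # the last visible name is gone

# The hidden reference still keeps the object alive.
survived = ref() is not None

obj = ref()    # temporarily hold a normal reference again
py_decref(obj) # give back the hidden reference
del obj        # now the count reaches zero and the object is freed

freed = ref() is None
```

The increment and decrement must be balanced exactly once each, which is the same discipline the Cython workaround has to follow in __init__ and __dealloc__.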
Cython can call into both C and C++ code, and even subclass C++ classes. The C++ support is somewhat limited, though, given how complex the C++ language is.
Cython has native support for most of the C++ language. Specifically: C++ objects can be dynamically allocated with the new keyword and released with del. C++ objects can also be stack-allocated.
Cython is fast while still offering the flexibility of an object-oriented, functional, and dynamic programming language. One of its key features is optional static type declarations, which are available out of the box.
There are two kinds of function definition in Cython: Python functions are defined using the def statement, as in Python. They take Python objects as parameters and return Python objects. C functions are defined using the cdef statement in Cython syntax or with the @cfunc decorator.
cdef extern from "somelib.h":
    cdef cppclass Widget:
        pass
    cdef cppclass Sprocket:
        Sprocket(Widget*)

# Forward declaration so PyWidget can name PySprocket in its method signatures
cdef class PySprocket

cdef class PyWidget:
    cdef Widget *thisptr
    cdef set sprockets
    def __init__(self):
        self.thisptr = new Widget()
        self.sprockets = set()
    def __dealloc__(self):
        print 'PyWidget dealloc'
        # PyWidget knows the sprockets and notifies them on destroy
        sprockets_to_dealloc = self.sprockets.copy()
        # with this solution spr items can call back to detach
        for spr in sprockets_to_dealloc:
            del spr
        del self.thisptr
    def attach(self, PySprocket spr):
        print 'PySprocket attach'
        self.sprockets.add(spr)
    def detach(self, PySprocket spr):
        print 'PySprocket detach'
        self.sprockets.remove(spr)

cdef class PySprocket:
    cdef PyWidget widget
    cdef Sprocket *thisptr
    def __init__(self, PyWidget widget):
        self.thisptr = new Sprocket(widget.thisptr)
        # You should be sure here that the widget exists
        widget.attach(self)
        self.widget = widget
    def __dealloc__(self):
        self.widget.detach(self)
        del self.thisptr
I'll come back a bit later to check what I have written, because I'm quite tired, but here is what matters: the point is that you want to notify the Sprockets when destroying a Widget, and vice versa.
It is a general solution and can be tuned further.
You also have to include error handling, which I have skipped entirely. This has nothing to do with the garbage collector; there was a design problem in your code.
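The notification idea can be sketched in plain Python with an explicit close() protocol instead of relying on destructor order. This is a rough illustration only, not the Cython wrapper: the close, attach, and detach methods, the alive and closed flags, and the events list are all hypothetical names introduced here, and a weakref.WeakSet is used so the widget's registry does not itself create a reference cycle.

```python
import weakref

events = []  # records the teardown order so it can be inspected


class Widget(object):
    def __init__(self):
        self.alive = True
        # Weak registry: knowing about sprockets must not keep them alive.
        self._sprockets = weakref.WeakSet()

    def attach(self, sprocket):
        self._sprockets.add(sprocket)

    def detach(self, sprocket):
        self._sprockets.discard(sprocket)

    def close(self):
        if not self.alive:
            return
        # Notify dependents first; copy because close() detaches them.
        for sprocket in list(self._sprockets):
            sprocket.close()
        self.alive = False
        events.append("widget closed")


class Sprocket(object):
    def __init__(self, widget):
        self.widget = widget
        self.closed = False
        widget.attach(self)

    def close(self):
        if self.closed:
            return
        # The widget must still be usable while the sprocket shuts down.
        assert self.widget.alive
        self.closed = True
        self.widget.detach(self)
        events.append("sprocket closed")


widget = Widget()
sprocket = Sprocket(widget)
widget.close()  # tears down the sprocket first, then the widget
```

In a real wrapper the close() methods would be where the C++ objects are deleted, so the Sprocket destructor always runs while its Widget is still valid, regardless of what the garbage collector does later.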
EDIT:
These two snippets are equivalent:
A
class BadWidget(PyWidget):
    pass
widget = BadWidget()
sprocket = PySprocket(widget)
widget.cycle = sprocket ###1
del widget ###2
del sprocket

B
class BadWidget(PyWidget):
    pass
widget = BadWidget()
sprocket = PySprocket(widget)
sprocket.widget.cycle = sprocket ###1
del sprocket.widget ###2
del sprocket
###2 will call sprocket.widget.__dealloc__(), and it doesn't deallocate sprocket.widget.cycle, so the sprocket will survive the widget.