Python C-API Object Allocation

I want to use the new and delete operators for creating and destroying my objects.

The problem is that Python seems to break this into several stages: tp_new, tp_init and tp_alloc for creation, and tp_del, tp_free and tp_dealloc for destruction. C++, however, just has new, which allocates and fully constructs the object, and delete, which destructs and deallocates it.

Which of the Python tp_* methods do I need to provide and what must they do?

I also want to be able to create the object directly in C++, e.g. "PyObject *obj = new MyExtensionObject(args);". Will I need to overload the new operator in some way to support this?

I would also like to be able to subclass my extension types in Python. Is there anything special I need to do to support this?

I'm using Python 3.0.1.

EDIT: OK, tp_init seems to make objects a bit too mutable for what I'm doing (e.g. take a Texture object: changing the contents after creation is fine, but changing fundamental aspects of it such as size, bit depth, etc. will break lots of existing C++ code that assumes those things are fixed). If I don't implement it, will that simply stop people from calling __init__ AFTER the object is constructed (or at least ignore the call, like tuple does)? Or should I have some flag that throws an exception or something if tp_init is called more than once on the same object?

Apart from that, I think I've got most of the rest sorted.

extern "C"
{
    //creation + destruction
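    PyObject* global_alloc(PyTypeObject *type, Py_ssize_t items)
    {
        //tp_alloc must hand back zeroed memory with ob_refcnt and ob_type set
        size_t size = type->tp_basicsize + items*type->tp_itemsize;
        PyObject *obj = (PyObject*)new char[size]();//() zero-fills the buffer
        return PyObject_Init(obj, type);
    }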
    void global_free(void *mem)
    {
        delete[] (char*)mem;
    }
}
template<class T> class ExtensionType
{
    PyTypeObject *t;
public:
    ExtensionType()
    {
        //not sure on this one, what is the "correct" way to create an empty type object
        //copying a statically initialised blank gives ob_refcnt 1 and zeroes every other field
        static const PyTypeObject blank = { PyVarObject_HEAD_INIT(NULL, 0) };
        t = new PyTypeObject(blank);

        t->tp_basicsize = sizeof(T);
        t->tp_itemsize  = 0;

        t->tp_name = "unknown";

        t->tp_alloc   = (allocfunc) global_alloc;
        t->tp_free    = (freefunc)  global_free;
        t->tp_new     = (newfunc)   T::obj_new;
        t->tp_dealloc = (destructor)T::obj_dealloc;
        ...
    }
    ...bunch of methods for changing stuff...
    PyObject *Finalise()
    {
    ...
    }
};
template <class T> class PyObjectExtension : public PyObject
{
...
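    //a linkage specification is not allowed inside a class, so these are plain static members
    static PyObject* obj_new(PyTypeObject *subtype, PyObject *args, PyObject *kwds)
    {
        void *mem = (void*)subtype->tp_alloc(subtype, 0);
        if (!mem) return NULL;//allocation failed
        return (PyObject*)new(mem) T(args, kwds);
    }
    static void obj_dealloc(PyObject *obj)
    {
        ((T*)obj)->~T();//run the destructor on this instance
        obj->ob_type->tp_free(obj);//most of the time this is global_free(obj)
    }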
...
};
class MyObject : public PyObjectExtension<MyObject>
{
public:
    static PyObject* InitType()
    {
        ExtensionType<MyObject> extType;//no parentheses: "extType()" would declare a function
        ...sets other stuff...
        return extType.Finalise();
    }
    ...
};
Fire Lancer asked Feb 21 '09 16:02


1 Answer

The documentation for these is at http://docs.python.org/3.0/c-api/typeobj.html and http://docs.python.org/3.0/extending/newtypes.html describes how to make your own type.

tp_alloc does the low-level memory allocation for the instance. This is equivalent to malloc(), plus zeroing the memory, setting the type pointer and initializing the refcnt to 1. Python has its own allocator, PyType_GenericAlloc, but a type can implement a specialized allocator.

tp_new is the same as Python's __new__. It's usually used for immutable objects where the data is stored in the instance itself, as compared to a pointer to data. For example, strings and tuples store their data in the instance, instead of using a char * or a PyTuple *.

For this case, tp_new figures out how much memory is needed, based on the input parameters, and calls tp_alloc to get the memory, then initializes the essential fields. tp_new does not need to call tp_alloc. It can for example return a cached object.
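A rough C++ analogue of that split, with no Python headers involved, might look like the sketch below. raw_alloc and raw_free are hypothetical stand-ins for tp_alloc and tp_free, make_texture plays the role of tp_new and destroy_texture that of tp_dealloc; Texture is an invented example type. It only illustrates the allocate/construct and destruct/free pairing, not how CPython implements it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Invented example type; counts live instances so the pairing can be checked.
struct Texture {
    static int live;
    int width, height;
    Texture(int w, int h) : width(w), height(h) { ++live; }
    ~Texture() { --live; }
};
int Texture::live = 0;

// Stand-in for tp_alloc: zeroed raw memory, no constructor run.
void *raw_alloc(std::size_t size) { return std::calloc(1, size); }

// Stand-in for tp_free: releases raw memory, no destructor run.
void raw_free(void *mem) { std::free(mem); }

// Stand-in for tp_new: get memory, then construct in place.
Texture *make_texture(int w, int h) {
    void *mem = raw_alloc(sizeof(Texture));
    if (!mem) return nullptr;          // allocation failed
    return new (mem) Texture(w, h);    // placement new: construct only
}

// Stand-in for tp_dealloc: destruct explicitly, then free the memory.
void destroy_texture(Texture *t) {
    if (!t) return;
    t->~Texture();
    raw_free(t);
}
```

The point of the shape is that construction and allocation are separate calls, so a different allocator can be swapped in without touching the constructor.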

tp_init is the same as Python's __init__. Most of your initialization should be in this function.

The distinction between __new__ and __init__ is called two-stage initialization, or two-phase initialization.
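In C++ terms, two-stage initialization can be sketched as a constructor that cannot fail, followed by a separate init call that reports failure through its return value. Image, init and the once-only guard are all invented for this illustration; the guard shows one possible way to reject a second init on an object whose size must stay fixed.

```cpp
#include <cassert>

class Image {
public:
    // Stage 1: leaves the object in a known, empty state and cannot fail.
    Image() : width_(0), height_(0), ready_(false) {}

    // Stage 2: returns false instead of throwing, and refuses to run twice.
    bool init(int width, int height) {
        if (ready_) return false;                      // already initialized
        if (width <= 0 || height <= 0) return false;   // invalid arguments
        width_ = width;
        height_ = height;
        ready_ = true;
        return true;
    }

    bool ready() const { return ready_; }
    int width() const { return width_; }
    int height() const { return height_; }

private:
    int width_, height_;
    bool ready_;
};
```

A failed init leaves the object untouched, so the caller can retry with valid arguments, while a successful one locks the dimensions in.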

You say "c++ just has new" but that's not correct. tp_alloc corresponds to a custom arena allocator in C++, __new__ corresponds to a custom type allocator (a factory function), and __init__ is more like the constructor. That last link discusses more about the parallels between C++ and Python style.
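As an illustration of the C++ side of that comparison, a class can supply its own operator new and operator delete, so that a plain new expression runs a custom allocator first and the constructor second. Mesh, its counters and create_mesh are hypothetical names, and nothing here touches the Python API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

struct Mesh {
    static int allocations;   // bumped by operator new
    static int frees;         // bumped by operator delete
    int vertices;

    explicit Mesh(int n) : vertices(n) {}

    // Class-specific allocator: `new Mesh(...)` calls this before the constructor.
    static void *operator new(std::size_t size) {
        void *mem = std::calloc(1, size);
        if (!mem) throw std::bad_alloc();
        ++allocations;
        return mem;
    }

    // Matching deallocator: `delete p` calls this after the destructor.
    static void operator delete(void *mem) noexcept {
        if (!mem) return;
        ++frees;
        std::free(mem);
    }
};
int Mesh::allocations = 0;
int Mesh::frees = 0;

// Factory function, loosely comparable to __new__: it decides how the object is made.
Mesh *create_mesh(int n) { return new Mesh(n); }
```

With this in place, calling code keeps writing new and delete as usual while the class controls where the memory comes from.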

Also read http://www.python.org/download/releases/2.2/descrintro/ for details about how __new__ and __init__ interact.

You write that you want to "create the object directly in c++". That's rather difficult because, at the least, you'll have to convert any Python exceptions that occurred during object instantiation into C++ exceptions. You might try looking at Boost.Python for some help with this task. Or you can use a two-phase initialization. ;)

Andrew Dalke answered Oct 21 '22 11:10