
Python C API, High reference count for new Object

I'm just trying to understand how to deal with the reference counts when using the Python C API.

I want to call a Python function in C++, like this:

PyObject* script;
PyObject* scriptRun;
PyObject* scriptResult;

// import module
script = PyImport_ImportModule("pythonScript");
// get function objects
scriptRun = PyObject_GetAttrString(script, "run");
// call function without/empty arguments
scriptResult = PyObject_CallFunctionObjArgs(scriptRun, NULL);

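if (scriptResult == NULL) {
    cout << "scriptResult  = null" << endl;
    PyErr_Print();  // show the Python exception that made the call fail
} else {
    cout << "scriptResult  != null" << endl;
    // read the count only when the pointer is valid; Py_REFCNT is the accessor macro
    cout << "print reference count: " << Py_REFCNT(scriptResult) << endl;
}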

The Python code in pythonScript.py is very simple:

def run():
    return 1

The documentation of "PyObject_CallFunctionObjArgs" says that you get a new reference as return value. So I would expect "scriptResult" to have a reference count of 1. However the output is:

scriptResult  != null
print reference count: 72

Furthermore, I would expect a memory leak if I did this in a loop without decreasing the reference count. However, that does not seem to happen.

Could someone help me understand?

Kind regards!

user1774143 asked Oct 25 '12 12:10





2 Answers

The confusion arises because CPython caches and shares small integers (roughly -5 through 256), and likewise treats True, False, None, single-character strings and similar values as shared objects. This is often loosely called interning, and it is why the "is" operator behaves unexpectedly with integers: wherever such a value is used or obtained in a program, the runtime tries to hand back the same object instance:

>>> 1 is 1
True
>>> 1 + 1 is 2
True
>>> 1000 + 1 is 1001
False

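(The last result depends on the interpreter version: newer CPython releases may fold 1000 + 1 at compile time and reuse the constant, printing True.) This means that when you write return 1, you're returning an already existing int object instance with (as you've seen) a considerable reference count. Because the same instance is used elsewhere and is never freed, failing to decrement its reference count with Py_DECREF won't result in a memory leak, although it is still a reference-counting bug.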
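The shared small integers can be observed without writing literals, which sidesteps compile-time constant effects. This is a minimal sketch assuming CPython, where the cache covers -5 through 256; the variable names are illustrative, and other implementations may behave differently.

```python
# Build the integers at run time so the compiler cannot merge equal literals.
small_a = int("256")
small_b = int("256")
large_a = int("1001")
large_b = int("1001")

# CPython hands out one shared object for each cached small integer.
same_small = small_a is small_b

# Outside the cache, each conversion creates a separate object.
same_large = large_a is large_b

print("256 shared:", same_small)
print("1001 shared:", same_large)
```

Identity checks like these are only useful for exploring the interpreter; ordinary code should compare integers with ==.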

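If you change your script to return object() then you will see an initial reference count of 1, and a memory leak if you never call Py_DECREF on the result. Note that return 1001 is not a clean test: the value is not in the small-integer cache, but the literal is stored as a constant in the function's code object, so the count will be small yet greater than 1, and the same object is returned on every call.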
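To compare these return values from the Python side, here is a rough sketch using sys.getrefcount. The function names run_small, run_literal and run_fresh are made up for illustration, and exact counts are CPython implementation details that vary by version, so only comparisons are shown.

```python
import sys

def run_small():
    return 1          # cached small integer, shared interpreter-wide

def run_literal():
    return 1001       # literal stored in the function's code object

def run_fresh():
    return object()   # brand-new object on every call

small = run_small()
fresh = run_fresh()

# getrefcount may include temporary references of its own, so compare
# the counts against each other rather than trusting exact values.
small_count = sys.getrefcount(small)
fresh_count = sys.getrefcount(fresh)
print("cached int has more references:", small_count > fresh_count)

# The literal lives in co_consts, so every call returns the same object.
literal_is_constant = 1001 in run_literal.__code__.co_consts
literal_reused = run_literal() is run_literal()

# object() yields a distinct instance per call.
first = run_fresh()
second = run_fresh()
fresh_reused = first is second
print("literal reused:", literal_reused, "fresh reused:", fresh_reused)
```

Only the object() case creates something that nothing else refers to, which is why it is the one that leaks when the C++ caller forgets Py_DECREF.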

ecatmur answered Oct 18 '22 23:10



ecatmur is right: small integers and many strings are cached or interned in CPython, so try a plain object() instance instead.

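A simple demo with gc (gc.get_referrers lists the container objects that refer to a value, which approximates its reference count but is not the same thing):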

import gc


def run():
    return 1

s = run()
print(len(gc.get_referrers(s)))  # prints a rather big number, 41 in my case

obj = object()
print(len(gc.get_referrers(obj)))  # prints 1

lst = [obj]
print(len(gc.get_referrers(obj)))  # prints 2

lst = []
print(len(gc.get_referrers(obj)))  # prints 1 again

A bit more: when CPython creates a new object, it calls the C-level helper _Py_NewReference to initialize the reference count to 1. After that, Py_INCREF(op) and Py_DECREF(op) increase and decrease the count, and the object is deallocated once the count drops to zero.
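The effect of those increments and decrements can be watched from Python, since storing an object in a container and removing it again triggers them internally. This is a small sketch assuming CPython; only differences between counts are compared, because the absolute numbers can include temporaries.

```python
import sys

obj = object()
base = sys.getrefcount(obj)

holder = [obj]                      # the list takes one new reference
after_add = sys.getrefcount(obj)

holder.clear()                      # the list releases its reference
after_drop = sys.getrefcount(obj)

print("added:", after_add - base)
print("after release:", after_drop - base)
```

In C or C++ code the same bookkeeping is the caller's job: every new reference returned by the API needs a matching Py_DECREF once it is no longer used.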

K Z answered Oct 18 '22 23:10
