I'm just trying to understand how to deal with the reference counts when using the Python C API.
I want to call a Python function in C++, like this:
PyObject* script;
PyObject* scriptRun;
PyObject* scriptResult;
// import module
script = PyImport_ImportModule("pythonScript");
// get function objects
scriptRun = PyObject_GetAttrString(script, "run");
// call function without/empty arguments
scriptResult = PyObject_CallFunctionObjArgs(scriptRun, NULL);
if (scriptResult == NULL) {
    cout << "scriptResult = null" << endl;
} else {
    cout << "scriptResult != null" << endl;
    // read the count only when the call succeeded, to avoid dereferencing NULL
    cout << "print reference count: " << scriptResult->ob_refcnt << endl;
}
The Python code in pythonScript.py is very simple:
def run():
    return 1
The documentation of "PyObject_CallFunctionObjArgs" says that you get a new reference as return value. So I would expect "scriptResult" to have a reference count of 1. However the output is:
scriptResult != null
print reference count: 72
Furthermore, I would expect a memory leak if I did this in a loop without decrementing the reference count. However, this does not seem to happen.
Could someone help me understand?
Kind regards!
Every object in Python has a reference count and a pointer to a type. You can read the current reference count of an object with sys.getrefcount(object), but keep in mind that passing the object to getrefcount() temporarily increases the reported count by 1.
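A minimal sketch of that caveat, assuming CPython (the names obj and holder are only illustrative). Absolute counts differ between interpreter versions, so the example compares counts before and after adding a reference rather than relying on a fixed number:

```python
import sys

obj = object()

# The reported value usually includes the temporary reference created
# by passing obj to getrefcount() itself.
before = sys.getrefcount(obj)

holder = [obj]                      # the list holds one more reference
after_append = sys.getrefcount(obj)

holder.clear()                      # the list drops its reference again
after_clear = sys.getrefcount(obj)

print("change after storing in a list:", after_append - before)
print("change after clearing the list:", after_clear - before)
```

Comparing differences like this sidesteps the extra reference that getrefcount() adds, since it is present in every measurement.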
PyObject is an object structure that you use to define object types for Python. All Python objects share a small number of fields that are defined using the PyObject structure. All other object types are extensions of this type. PyObject tells the Python interpreter to treat a pointer to an object as an object.
void Py_DECREF(PyObject *o): Decrement the reference count for object o. If the reference count reaches zero, the object's type's deallocation function (which must not be NULL) is invoked. This function is usually used to delete a strong reference before exiting its scope.
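The effect of the count reaching zero can be observed from Python as well. This is a rough sketch that assumes CPython, where deallocation happens as soon as the last strong reference goes away; the Payload class and the probe name are made up for the example:

```python
import weakref


class Payload:
    """Plain class; its instances support weak references."""


obj = Payload()
probe = weakref.ref(obj)      # a weak reference does not raise the count

alive_before = probe() is not None

del obj                       # drop the only strong reference

alive_after = probe() is not None
print("alive before del:", alive_before)
print("alive after del: ", alive_after)
```

Other Python implementations may delay the deallocation, so the second check is only meaningful under reference counting.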
The confusion is that small integers (also True, False, None, single-character strings, etc.) are interned (see "'is' operator behaves unexpectedly with integers"), which means that wherever they are used or obtained in a program, the runtime will try to use the same object instance:
>>> 1 is 1
True
>>> 1 + 1 is 2
True
>>> 1000 + 1 is 1001
False
This means that when you write return 1, you're returning an already existing int object instance with (as you've seen) a considerable reference count. Because the same instance is used elsewhere, failing to decrement its reference count won't result in a memory leak.
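If you change your script to return object(), then you will see an initial reference count of 1 and a memory leak if you never call Py_DECREF on the result. With return 1001 the value falls outside the small-integer cache, but the literal is typically also held as a constant by the function's code object, so the count you see may be higher than 1.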
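The difference between a cached small integer and a fresh object can also be seen from Python. This is a small sketch assuming CPython; the exact numbers vary by version (newer releases report a very large fixed value for small integers), so it only compares the two counts:

```python
import sys

small = 1            # served from CPython's cache of small integers
fresh = object()     # a brand-new object referenced only by this name

small_count = sys.getrefcount(small)
fresh_count = sys.getrefcount(fresh)

print("refcount of 1:       ", small_count)
print("refcount of object():", fresh_count)
```

The count for 1 should come out far larger than the count for the fresh object, which is the same effect the C++ program in the question shows.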
ecatmur is right: small numbers and short strings are interned in Python, so you can try with a plain object() instance instead.
A simple demo with gc:
import gc

def run():
    return 1

s = run()
print(len(gc.get_referrers(s)))    # prints a rather big number, 41 in my case

obj = object()
print(len(gc.get_referrers(obj)))  # prints 1

lst = [obj]
print(len(gc.get_referrers(obj)))  # prints 2

lst = []
print(len(gc.get_referrers(obj)))  # prints 1 again
A bit more: when CPython creates a new object, it calls _Py_NewReference to initialize the reference count to 1. It then uses Py_INCREF(op) and Py_DECREF(op) to increase and decrease the reference count.
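To watch an increment and a decrement without writing an extension module, the exported function forms Py_IncRef and Py_DecRef can be called through ctypes. This is only a sketch for experimenting on CPython; the names inc_ref and dec_ref are local aliases chosen for the example:

```python
import ctypes
import sys

# Py_IncRef and Py_DecRef are the exported function counterparts of
# the Py_INCREF and Py_DECREF macros.
inc_ref = ctypes.pythonapi.Py_IncRef
dec_ref = ctypes.pythonapi.Py_DecRef
for fn in (inc_ref, dec_ref):
    fn.argtypes = [ctypes.py_object]
    fn.restype = None

obj = object()
start = sys.getrefcount(obj)

inc_ref(obj)                      # take one extra strong reference
after_inc = sys.getrefcount(obj)

dec_ref(obj)                      # release it again, or obj would leak
after_dec = sys.getrefcount(obj)

print("after Py_IncRef:", after_inc - start)
print("after Py_DecRef:", after_dec - start)
```

Leaving out the dec_ref call would keep obj alive for the rest of the process, which is the same kind of leak the question asks about.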