Python C API, High reference count for new Object

Tags: python, python-c-api, automatic-ref-counting

I'm just trying to understand how to deal with the reference counts when using the Python C API.

I want to call a Python function in C++, like this:

PyObject* script;
PyObject* scriptRun;
PyObject* scriptResult;

// import module
script = PyImport_ImportModule("pythonScript");
// get function objects
scriptRun = PyObject_GetAttrString(script, "run");
// call function without/empty arguments
scriptResult = PyObject_CallFunctionObjArgs(scriptRun, NULL);

if (scriptResult == NULL) {
    cout << "scriptResult  = null" << endl;
} else {
    cout << "scriptResult  != null" << endl;
    // only dereference the result when the call succeeded
    cout << "print reference count: " << scriptResult->ob_refcnt << endl;
}

The Python code in pythonScript.py is very simple:

def run():
    return 1

The documentation of "PyObject_CallFunctionObjArgs" says that you get a new reference as return value. So I would expect "scriptResult" to have a reference count of 1. However the output is:

scriptResult  != null
print reference count: 72

Furthermore, I would expect a memory leak if I did this in a loop without decreasing the reference count. However, this does not seem to happen.

Could someone help me understand?

Kind regards!

asked Oct 25 '12 12:10

user1774143

2 Answers

The confusion is that CPython caches small integers (the range -5 to 256), and that True, False and None are singletons; some short strings are interned in a similar way. Wherever one of these values is used or obtained in a program, the runtime tries to hand back the same object instance (see also the question '"is" operator behaves unexpectedly with integers'):

>>> 1 is 1
True
>>> 1 + 1 is 2
True
>>> 1000 + 1 is 1001
False

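This means that when you write return 1, you're returning an already existing int object instance with (as you've seen) a considerable reference count. The call still gives you a new reference, in the sense that the count was incremented on your behalf, but because the same instance is kept alive elsewhere, failing to Py_DECREF it won't result in a memory leak: the object would never have been freed anyway. Note that the last result in the session above is an implementation detail; newer CPython versions may fold and share constants within a single expression and print True there.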
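To see the cache without the compiler's constant handling getting in the way, here is a minimal sketch that builds the integers at run time from strings. The helper name same_object is made up for illustration, and the behaviour shown is a CPython implementation detail, not a language guarantee.

```python
def same_object(text):
    """Parse the same decimal string twice and report whether CPython
    hands back one shared int object (True) or two separate ones (False)."""
    first = int(text)
    second = int(text)
    return first is second


# 1 lies inside CPython's small-integer cache, 1001 lies outside it.
small_shared = same_object("1")
large_shared = same_object("1001")

print(small_shared, large_shared)
```

Building the values with int() at run time means both objects are created by the interpreter rather than taken from the constants stored in the compiled code.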

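If you change your script to return object(), you will see an initial reference count of 1, and a memory leak if you never release the result. With return 1001 the count will typically be small but not exactly 1, because the function's code object holds its own reference to the constant.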
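You can get a rough feel for the difference from Python itself with sys.getrefcount. This is only a sketch: run_small and run_fresh are hypothetical stand-ins for the script's run function, and the absolute numbers depend on the CPython version (getrefcount also counts its own temporary reference, and recent versions report a very large count for cached integers).

```python
import sys


def run_small():
    # returns the shared, cached int object
    return 1


def run_fresh():
    # returns a brand-new object on every call
    return object()


small_count = sys.getrefcount(run_small())
fresh_count = sys.getrefcount(run_fresh())

# Only the comparison matters here, not the exact values.
print(small_count, fresh_count)
```

The cached integer reports a far higher count than the freshly created object, which mirrors what the C++ code prints through ob_refcnt.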

answered Oct 18 '22 23:10

ecatmur

ecatmur is right: small integers and some strings are shared in Python, so try it with a plain object() instead.

A simple demo with gc:

import gc


def run():
    return 1

# Note: gc.get_referrers lists the containers that refer to an object,
# which is related to, but not the same as, its reference count.
s = run()
print(len(gc.get_referrers(s)))  # prints a rather big number, 41 in my case

obj = object()
print(len(gc.get_referrers(obj)))  # prints 1

lst = [obj]
print(len(gc.get_referrers(obj)))  # prints 2

lst = []
print(len(gc.get_referrers(obj)))  # prints 1 again

A bit more: when CPython creates a new object, it calls _Py_NewReference (a macro in older CPython releases, a function in newer ones) to initialize the reference count to 1. After that, Py_INCREF(op) and Py_DECREF(op) are used to increase and decrease the reference count.
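If you want to watch this happen without writing C, one option is to call the exported function forms Py_IncRef and Py_DecRef through ctypes. This is a sketch assuming a standard CPython build where ctypes.pythonapi is available; the Payload class is a hypothetical stand-in, used only because a plain object() cannot be weakly referenced.

```python
import ctypes
import sys
import weakref

# Py_IncRef / Py_DecRef are the function counterparts of the C macros.
incref = ctypes.pythonapi.Py_IncRef
incref.argtypes = [ctypes.py_object]
incref.restype = None
decref = ctypes.pythonapi.Py_DecRef
decref.argtypes = [ctypes.py_object]
decref.restype = None


class Payload(object):
    """Hypothetical stand-in for the value returned by run()."""


obj = Payload()
probe = weakref.ref(obj)  # lets us check whether the object is still alive

before = sys.getrefcount(obj)
incref(obj)               # take an extra reference, as C code would own one
after = sys.getrefcount(obj)

del obj                   # drop the only Python-level name
alive_after_del = probe() is not None     # the extra reference keeps it alive

tmp = probe()             # get a handle back so the reference can be released
decref(tmp)               # give up the extra reference
del tmp
alive_after_decref = probe() is not None  # now the object has been freed

print(after - before, alive_after_del, alive_after_decref)
```

The middle step is the leak from the question in miniature: a reference that nobody releases keeps a freshly created object alive indefinitely.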

answered Oct 18 '22 23:10

K Z
