
When to Py_INCREF?

I'm working on a C extension and am at the point where I want to track down memory leaks. From reading Python's documentation, it's hard to understand when to increment or decrement the reference count of Python objects. Also, after spending a couple of days trying to embed the Python interpreter (in order to compile the extension as a standalone program), I had to give up on that endeavor. So tools like Valgrind are of no help here.

So far, by trial and error I learned that, for example, Py_DECREF(Py_None) is a bad thing... but is this true of any constant? I don't know.

My major confusions so far can be listed like this:

  1. Do I have to decrement refcount on anything created by PyWhatever_New() if it doesn't outlive the procedure that created it?
  2. Does every Py_INCREF need to be matched by Py_DECREF, or should there be one more of one / the other?
  3. If a call to Python procedure resulted in a PyObject*, do I need to increment it to ensure that I can still use it (forever), or decrement it to ensure that eventually it will be garbage-collected, or neither?
  4. Are Python objects created through C API on the stack allocated on stack or on heap? (It is possible that Py_INCREF reallocates them on heap for example).
  5. Do I need to do anything special to Python objects created in C code before passing them to Python code? What if Python code outlives C code that created Python objects?
  6. Finally, I understand that Python has both reference counting and garbage collector: if that's the case, how critical is it if I mess up the reference count (i.e. not decrement enough), will GC eventually figure out what to do with those objects?
wvxvw asked May 14 '18 18:05


1 Answer

Most of this is covered in Reference Count Details, and the rest is covered in the docs on the specific questions you're asking about. But, to get it all in one place:

Py_DECREF(Py_None) is a bad thing... but is this true of any constant?

The more general rule is that calling Py_DECREF on anything you didn't get a new/stolen reference to, and didn't call Py_INCREF on, is a bad thing. Since you never keep a reference of your own to anything accessible as a constant (the one Py_INCREF you might do, as Py_RETURN_NONE traditionally does, creates a reference that goes straight to your caller), you never call Py_DECREF on them.

Do I have to decrement refcount on anything created by PyWhatever_New()

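Yes. Anything that returns a "new reference" has to be decremented once you are done with it, unless you hand that reference to someone else (by returning it, or by passing it to a function that steals it). By convention, anything that ends in _New should return a new reference, but it should be documented anyway (e.g., see PyList_New).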

Does every Py_INCREF need to be matched by Py_DECREF, or should there be one more of one / the other?

The number in your own code may not necessarily balance. The total number has to balance, but there are increments and decrements happening inside Python itself. For example, anything that returns a "new reference" has already done an inc, while anything that "steals" a reference will do the dec on it.
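To make that bookkeeping concrete, here is a rough sketch that calls two real C API functions from Python through ctypes.pythonapi and watches sys.getrefcount. It assumes a regular CPython build. The Spam class and the refs_added_by helper are made up for illustration, and real extension code would make the same calls in C. PyList_Append takes its own reference to the item, while PyList_SetItem steals one, so the caller first has to create the reference it gives away.

```python
import ctypes
import sys

api = ctypes.pythonapi  # the C API of the running CPython interpreter

# Declare signatures so ctypes passes real PyObject* values.
api.Py_IncRef.argtypes = [ctypes.py_object]
api.Py_IncRef.restype = None
api.PyList_Append.argtypes = [ctypes.py_object, ctypes.py_object]
api.PyList_Append.restype = ctypes.c_int
api.PyList_SetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t, ctypes.py_object]
api.PyList_SetItem.restype = ctypes.c_int


class Spam:
    """Hypothetical throwaway class; instances are ordinary heap objects."""


def refs_added_by(action, obj):
    """Return how much obj's reference count changed across action(obj)."""
    before = sys.getrefcount(obj)
    action(obj)
    return sys.getrefcount(obj) - before


appended = Spam()
target = []
# PyList_Append does not steal: the list performs its own increment.
delta_append = refs_added_by(lambda o: api.PyList_Append(target, o), appended)

handed_off = Spam()
slots = [None]


def set_item_with_handoff(o):
    api.Py_IncRef(o)                 # create the reference we will give away
    api.PyList_SetItem(slots, 0, o)  # steals it, so no matching decrement here


delta_set_item = refs_added_by(set_item_with_handoff, handed_off)

print("PyList_Append added:", delta_append)
print("Py_IncRef + PyList_SetItem added:", delta_set_item)
```

In both cases the list ends up owning exactly one reference. The difference is who performed the increment: PyList_Append did it internally, whereas for PyList_SetItem the calling code did, with no matching decrement of its own.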

Are Python objects created through C API on the stack allocated on stack or on heap? (It is possible that Py_INCREF reallocates them on heap for example).

There's no way to create objects on the stack through the C API. The C API only has functions that return pointers to objects.

Most of these objects are allocated on the heap. Some are actually in static memory.

But your code should not care anyway. You never allocate or delete them; they get allocated in the PySpam_New and similar functions, and deallocate themselves when you Py_DECREF them to 0, so it doesn't matter to you where they are.

(The exception is constants that you can access via their global names, like Py_None. Those, you obviously know are in static storage.)

Do I need to do anything special to Python objects created in C code before passing them to Python code?

No.

What if Python code outlives C code that created Python objects?

I'm not sure what you mean by "outlives" here. Your extension module is not going to get unloaded while any objects depend on its code. (In fact, until at least 3.8, your module is probably never going to get unloaded until shutdown.)

If you just mean the function that _New'd up an object returning, that's not an issue. You have to go very far out of your way to allocate any Python objects on the stack. And there's no way to pass things like a C array of objects, or a C string, into Python code without converting them to a Python tuple of objects, or a Python bytes or str. There are a few cases where, e.g., you could stash a pointer to something on the stack in a PyCapsule and pass that—but that's the same as in any C program, and… just don't do it.

Finally, I understand that Python has both reference counting and garbage collector

The garbage collector is just a cycle breaker. If you have objects that are keeping each other alive with a reference cycle, you can rely on the GC. But if you've leaked references to an object, the GC will never clean it up.
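The difference can be observed from Python itself. The sketch below is only an illustration, assuming a regular CPython build: it uses ctypes.pythonapi.Py_IncRef to simulate a C extension that forgot a Py_DECREF, and a weak reference to check whether each object is still alive after gc.collect(). The Node class and both function names are hypothetical.

```python
import ctypes
import gc
import weakref

api = ctypes.pythonapi  # the C API of the running CPython interpreter
api.Py_IncRef.argtypes = [ctypes.py_object]
api.Py_IncRef.restype = None
api.Py_DecRef.argtypes = [ctypes.py_object]
api.Py_DecRef.restype = None


class Node:
    """Hypothetical plain object; supports attributes and weak references."""


def cycle_is_collected():
    """Two objects keep each other alive; the GC can break that cycle."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    probe = weakref.ref(a)
    del a, b
    gc.collect()
    return probe() is None


def leak_is_collected():
    """One extra increment with no owner; the GC has no way to detect it."""
    obj = Node()
    probe = weakref.ref(obj)
    api.Py_IncRef(obj)  # simulate a missing Py_DECREF in C code
    del obj
    gc.collect()
    survivor = probe()
    collected = survivor is None
    if survivor is not None:
        api.Py_DecRef(survivor)  # undo the simulated leak so the demo cleans up
    return collected


print("cycle collected:", cycle_is_collected())
print("leaked object collected:", leak_is_collected())
```

The cycle is reclaimed because every reference to the two objects comes from inside the cycle. The leaked object still has a count of one that no tracked container accounts for, so the collector has to assume something outside still uses it.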

abarnert answered Sep 28 '22 06:09