Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ shared_ptr vs. Python object

AFAIK, the use of shared_ptr is often discouraged because of potential bugs caused by careless usage of them (unless you have a really good explanation for significant benefit and carefully checked design).

On the other hand, Python objects seem to be essentially shared_ptrs (ref_count and garbage collection).

I am wondering what makes them work nicely in Python but potentially dangerous in C++. In other words, what are the differences between Python and C++ in dealing with shared_ptr that makes their usage discouraged in C++ but not causing similar problems in Python?

I know e.g. Python automatically detects cycles between objects which prevents memory leaks that dangling cyclic shared_ptrs can cause in C++.

like image 745
Kaveh Avatar asked Mar 02 '26 02:03

Kaveh


1 Answers

"I know e.g. Python automatically detects cycles" -- that's what makes them work nicely, at least so far as the "potential bugs" relate to memory leaks.

Besides which, C++ programs are more commonly written under tight performance constraints than Python programs (which IMO is a combination of different genuine requirements with some fairly bogus differences in rules-of-thumb, but that's another story). A fairly high proportion of the Python objects I use don't strictly need reference counting, they have exactly one owner and a unique_ptr would be fine (or for that matter a data member of class type). In C++ it's considered (by the people writing the advice you're reading) worth taking the performance advantage and the explicitly simplified design. In Python it's usually not considered a problem, you pay the performance and you keep the flexibility to decide later that it's shared after all without any code change required (other than to take additional references that outlive the original, I mean).

Btw in any language, shared mutable objects have "potential bugs" associated with them, if you lose track of what objects will or won't change when you're not looking at them. I don't just mean race conditions: even in a single-threaded program you need to be aware that C++ Predicates shouldn't change anything and that you (often) can't mutate a container while iterating over it. I don't see this as a difference between C++ and Python, though. Rather, to some extent you should be slightly wary of shared objects in Python too, and when you proliferate references to an object at least understand why you're doing it.

So, on to the list of issues in the question you link to:

  • cyclic references -- as mentioned, Python rolls its sleeves up, finds them and frees them. For reasons to do with the design and specific uses of the languages, cycle-breaking garbage collection is rather difficult to implement in C++, although not impossible.
  • creating multiple unrelated shared_ptrs to the same object -- no analog is possible in Python, since the reference-counter isn't open to the user to mess up.
  • Constructing an anonymous temporary shared pointer -- doesn't arise in Python, there's no risk of a memory leak that way in Python since there's no "gap" in which the object exists but is not yet subject to collection if it becomes unreferenced.
  • Calling the get() function to get the raw pointer and use it after the pointed-to object goes out of scope -- well, you can mess this up if you're writing Python/C, but not in pure Python.
  • Passing a reference of or a raw pointer to a shared_ptr should be dangerous too, since it won't increment the internal count -- there's no means in Python to add a reference without the language taking care of the refcount.
  • we passed 'this' to some thread workers instead of 'shared_from_this' -- in other words, forgot to create a shared_ptr when needed. Can't do this in Python.
  • most of the predicates you know and love from <functional> don't play nicely with shared_ptr -- Python refcounting is so built in to the runtime (or I suppose to be precise I should say: garbage collection is so built in to the language design) that there are no libraries that fail to cope with it.
  • Using shared_ptr for really small objects (like char short) could be an overhead -- issue exists in Python, and Python programmers generally don't sweat it. If you need an array of "primitive type" then you can use numpy to reduce overhead. Sometimes Python programs run out of memory and you need to do something about it, that's life ;-)
  • Giving out a shared_ptr< T > to this inside a class definition is also dangerous. Use enabled_shared_from_this instead -- it may not be obvious, but this is "don't create multiple unrelated shared_ptr to the same object" again.
  • You need to be careful when you use shared_ptr in multithread code -- it's possible to create race conditions in Python too, this is part of "shared mutable objects are tricksy".

Most of this is to do with the fact that in C++ you have to explicitly do something to get refcounting, and you don't get it if you don't ask for it. This provides several opportunities for error that Python doesn't make available to the programmer because it just does it for you. If you use shared_ptr correctly then apart from the existence of libraries that don't co-operate with it, none of these problems comes up in C++ either. Those who are cautious of using it for these reasons are basically saying they're afraid they'll use it incorrectly, or at any rate more afraid than that they'll misuse some alternative. Much of C++ programming is trading different potential bugs off against each other until you come up with a design that you consider yourself competent to execute. Furthermore it has "don't pay for what you don't need" as a design philosophy. Between these two factors, you don't do anything without a really good explanation, a significant benefit, and a carefully checked design. shared_ptr is no different ;-)

like image 168
Steve Jessop Avatar answered Mar 03 '26 16:03

Steve Jessop



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!