For the past few years, I've generally accepted that
if I am going to use ref-counted smart pointers
invasive smart pointers is the way to go
--
However, I'm starting to like non-invasive smart pointers due to the following:
The only reason why this Foo moving objects around in non-invasive smart pointers being easier than invasive smart pointers is:
In non-invasive smart pointers, there is only one pointer that points to each Foo.
In invasive smart pointers, I have no idea how many objects point to each Foo.
Now, the only cost of non-invasive smart pointers ... is the double indirection. [Perhaps this screws up the caches].
Does anyone have a good study of expensive this extra layer of indirection is?
EDIT: by smart pointers, I may be referring to what others call "shared-pointers"; the whole idea is: there is a reference-count attached to objects, and when it hits 0, the object is automatically deleted
There are several important difference between invasive or non-invasive pointers:
The biggest advantage of second (non-invasive):
shared_ptr
/weak_ptr
).Advantage of first is when you need to get smart pointer on this (at least in case of boost::shared_ptr
, std::tr1::shared_ptr
)
shared_ptr
from this in constructor and destructor.I don't know of a study on the additional cost due to non-invasive over invasive. But I would note that non-invasive seems to be universally recommended by the C++ experts. Of course, this may mean nothing! But the reasoning is pretty sound: if you need smart pointers, it's because you want a simpler way to implement object lifetime management instead of writing it by hand, so you are emphasising correctness and simplicity over performance, which is always a good idea until you have profiled a realistic model of your whole design.
It could well be that in a simplified test, non-invasive is twice as slow as invasive, and yet in a real program that does actual work, this speed difference is lost in the noise and becomes so insignificant you can't even measure it. This is quite a common phenomenon; the things you imagine are significant to performance very often aren't.
If you find a performance bottleneck, it's possible (likely?) that the work of maintaining the reference count itself (in both approaches) will have as much effect on performance as the extra indirection in the non-invasive approach. With raw pointers, the statement:
p1 = p2;
probably only needs to move a value between two CPU registers, after the optimiser has worked its magic. But if they are reference counting smart pointers, even with invasive it's like:
if (p1 != p2)
{
if ((p1 != 0) && (--(p1->count) == 0))
delete p1;
p1 = p2;
if (p1 != 0)
p1->count++;
}
This happens with every smart pointer argument passed to every function. So there are lots of extra accesses to potentially far-flung areas of memory, to raise and lower the count each time. And to be thread-safe, the increment and decrement-check operations have to be interlocked/atomic, which can have a severely negative effect on multiple cores.
I think of the "sweet spot" of C++ as being those situations where you don't need to manage wildly dynamic data structures like this. Instead you have a simple hierarchical pattern of object ownership, so there is an obvious single owner of each object, and the lifetime of data tends to follow the lifetime of function invocations (more often than not). You can then let the standard containers and the function call stack manage everything for you. This is emphasised in the forthcoming version of the language with rvalue references, unique_ptr
, etc. which is all about transferring around the single-ownership of an object in an easy way. If you really need dynamic multi-owner lifetime management, true GC would be faster and easier to use correctly, but C++ isn't a very happy home for GC.
Another minor point: unfortunately "In non-invasive smart pointers, there is only one pointer that points to each Foo" is untrue. Inside Foo
there is the this
pointer that is a Foo *
and so naked pointers can still be leaked out, often in quite hard-to-spot ways.
Well, first and foremost, I shall remind you that shared ownership is usually a difficult beast to tame and can lead to quite difficult to root out bugs.
There are many ways not to have shared ownership. The Factory
approach (implemented itself using a Boost Pointer Container) is personally one of my favorite.
Now, as for the reference counting goes....
1. Intrusive Pointers
The counter is embedded in the object itself which means:
weak_ptr
, so you cannot have cycles of references in your design without using an Observer
pattern... quite complicated2. Non-Intrusive Pointers
I will only talk about boost::shared_ptr
and boost::weak_ptr
. I have dug into the source a bit recently to precisely look at the mechanics and there are indeed much more complicated that the above!
// extract of <boost/shared_ptr.hpp>
template <class T>
class shared_ptr
{
T * px; // contained pointer
boost::detail::shared_count pn; // reference counter
};
weak_ptr
in case of cyclic references.shared_ptr
object needs to know about the object destructor (see Example)Here is a little example, to illustrate this forward declaration magic:
// foofwd.h
#include <boost/shared_ptr.hpp>
class Foo;
typedef boost::shared_ptr<Foo> foo_ptr;
foo_ptr make_foo();
// foo.h
#include "foofwd.h"
class Foo { /** **/ };
// foo.cpp
#include "foo.h"
foo_ptr make_foo() { return foo_ptr(new Foo()); }
// main.cpp
#include "foofwd.h"
int main(int argc, char* argv[])
{
foo_ptr p = make_foo();
} // p.get() is properly released
There is a little bit of template magic to authorized this. Basically the counter object embeds a disposer*
(yet a third allocation) that allows some type erasure to take place. Really useful though, since it really allows for forward declaration!
3. Conclusion
While I agree that Intrusive Pointers are probably faster as in less allocation going on (there are 3 different blocks of memory allocated for a shared_ptr
), there are also less practical.
So I'd like to point you out to the Boost Intrusive Pointer library, and more especially to its introduction:
As a general rule, if it isn't obvious whether
intrusive_ptr
better fits your needs thanshared_ptr
, try ashared_ptr
-based design first.
The only real cost of non-invasive ref-counting w.r.t. performance is that you sometimes require one extra allocation for the ref-counter. As far as I know, tr1::shared_ptr implementations don't do "double indirection". I suppose, it would be difficult to support conversions without letting shared_ptr store the pointer directly. A sensible shared_ptr implementation would store two pointers: One pointer to the object (no double indirection) and one pointer to some control structure.
Even the allocation overhead isn't necessary in all situations. See make_shared. C++0x will also provide a make_shared
function that allocates both, the object and the ref-counter in one go which is similar in that respect to the intrusive ref-counting alternative.
[...] Besides convenience and style, such a function is also exception safe and considerably faster because it can use a single allocation for both the object and its corresponding control block, eliminating a significant portion of shared_ptr's construction overhead. This eliminates one of the major efficiency complaints about shared_ptr. [...]
In the light of shared_ptr
and make_shared
, I have a hard time coming up with problems where an intrusive smart pointer would beat shared_ptr
significantly. Copying and destroying shared pointers might be a tad slower, though. Having said that, let me add that I rarely use these kinds of smart pointers. Most of the time unique owhership is all I need.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With