
Shared pointers and the performance

I have been using shared pointers for some time now, and I have performance issues in my program. I'd like to know whether shared pointers cause a performance decrease and, if so, how severe it is. Thanks a lot.

My program is multi-threaded and uses std::tr1::shared_ptr.

Guest asked Oct 12 '09 17:10


People also ask

Is shared pointer slow?

Admittedly, the std::shared_ptr is about two times slower than new and delete. Even std::make_shared has a performance overhead of about 10%.

What is the point of a shared pointer?

In C++, a shared pointer is one of the smart pointers. The shared pointer maintains a reference count which is incremented when another shared pointer points to the same object. So, when the reference count is equal to zero (i.e., no pointer points to this object), the object is destroyed.

Is unique_ptr slower?

unique_ptr, the default smart pointer in C++ is no slower than direct pointer access. If you require data sharing, then shared_ptr offers you that feature at the cost of reference counting.

What happens when a shared pointer goes out of scope?

The smart pointer has an internal counter which is decreased each time that a std::shared_ptr , pointing to the same resource, goes out of scope – this technique is called reference counting. When the last shared pointer is destroyed, the counter goes to zero, and the memory is deallocated.


3 Answers

If your app is passing around 700-byte XML messages that could be contained in 65-byte Google protocol messages or 85-byte ASN.1 messages, then it probably is not going to matter. But if it is processing a million somethings a second, then I would not dismiss the cost of adding two full read-modify-write (RMW) cycles to the passing of a pointer.

A full read-modify-write is on the order of 50 ns, so two cost about 100 ns. That is the cost of a lock inc and a lock dec, the same as two CAS operations, and half the cost of acquiring and releasing a Windows critical section. Compare this with a single one-machine-cycle push (400 picoseconds on a 2.5 GHz machine).

And this does not even include the other costs: invalidating the cache line that actually contains the count, the effects of the bus lock on other processors, and so on.
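To put a number on that cost for your own machine, a rough timing loop like the one below can help. It is only a sketch: it assumes C++11 std::shared_ptr rather than std::tr1::shared_ptr, the function name ns_per_shared_ptr_copy is made up for illustration, and the result varies with compiler, optimization level and hardware. Some standard libraries may also skip the atomic operations when the program has not started any threads, so an uncontended single-threaded figure is a lower bound at best.

```cpp
#include <cassert>
#include <chrono>
#include <memory>

// Hypothetical helper: average nanoseconds for one copy-and-destroy of a
// shared_ptr, i.e. one reference count increment plus one decrement.
double ns_per_shared_ptr_copy(long iterations) {
    auto p = std::make_shared<int>(1);
    long sink = 0;

    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        std::shared_ptr<int> copy = p;  // increment the count
        sink += *copy;                  // use the copy
    }                                   // decrement when copy is destroyed
    auto stop = std::chrono::steady_clock::now();

    volatile long keep = sink;          // make sure the loop result is used
    (void)keep;

    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Calling ns_per_shared_ptr_copy(10000000) a few times and comparing it with the same loop over a raw pointer gives a first impression, but it does not replace profiling the real program.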

Passing smart pointers by const reference is almost always to be preferred. If the callee does not make a new shared pointer when it wants to guarantee or control the lifetime of the pointee, then that is a bug in the callee. Passing thread-safe reference-counted smart pointers around by value willy-nilly is just asking for performance hits.
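A minimal sketch of the difference, assuming C++11 std::shared_ptr in place of std::tr1::shared_ptr. The Message and Cache types and the function names are invented for illustration only.

```cpp
#include <cassert>
#include <memory>

struct Message { int id; };

// By value: the parameter is a copy, so the count is incremented on entry
// and decremented again when the function returns.
long count_seen_by_value(std::shared_ptr<Message> msg) {
    return msg.use_count();
}

// By const reference: no copy is made, so the count is not touched.
long count_seen_by_const_ref(const std::shared_ptr<Message>& msg) {
    return msg.use_count();
}

// A callee that needs to extend the lifetime takes its own copy explicitly,
// and pays for the increment only in that case.
struct Cache {
    std::shared_ptr<Message> kept;
    void remember(const std::shared_ptr<Message>& msg) { kept = msg; }
};
```

With a single owner, count_seen_by_const_ref observes a count of 1 while count_seen_by_value observes 2; Cache::remember shows the callee making its own shared pointer when it really wants to control the lifetime.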

The use of reference counted pointers simplifies lifetimes no doubt, but to pass shared pointers by value to try to protect against defects in the callee is sheer and utter nonsense.

Excessive use of reference counting can in short order turn a svelte program that can process 1 million messages per second (mps) into a fat one that handles 150k mps on the same hardware. All of a sudden you need half a rack of servers and $10,000/year in electricity.

You are always better off if you can manage the lifetimes of your objects without reference counting.

An example of a simple improvement: if you are going to fan out an object and you know the breadth of the fan-out (say n), increment the count by n once rather than incrementing it individually at each fan-out.
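The fan-out idea can be sketched as follows. Note that shared_ptr offers no way to add n references at once, so this assumes a hand-rolled intrusive count built on C++11 std::atomic; Packet, acquire_n and release are hypothetical names, not part of any library.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object carrying its own (intrusive) reference count.
struct Packet {
    std::atomic<long> refs{1};  // the creator holds the first reference
    int payload = 0;
};

// Acquire n references with a single atomic read-modify-write,
// instead of n separate increments at each fan-out.
inline void acquire_n(Packet& p, long n) {
    p.refs.fetch_add(n, std::memory_order_relaxed);
}

// Drop one reference. Returns true if this was the last one,
// in which case the caller is responsible for freeing the object.
inline bool release(Packet& p) {
    return p.refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
}
```

Before handing the packet to three consumers, the producer would call acquire_n(p, 3) once; each consumer then calls release when it is done. The decrements are still individual, so this saves only on the increment side.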

BTW when the cpu sees a lock prefix, it really does say "Oh no this is going to hurt".

All that being said, I agree with everyone that you should verify the hot spot.

pgast answered Sep 29 '22 21:09


It's virtually impossible to correctly answer this question given the data. The only way to truly tell what is causing a performance issue in your application is to run a profiler on the program and examine the output.

That being said, it's very unlikely that a shared_ptr is causing the slowdown. The shared_ptr type and many early home-grown variants are used in an ever-increasing number of C++ programs. I myself use them in my work (professionally and at home). I've spent a lot of time profiling my work applications, and shared_ptr hasn't ever been even close to a problem in my code or any other code running within the application. It's much more likely that the error is elsewhere.

JaredPar answered Sep 29 '22 21:09


Shared pointers are reference counted. Particularly when you're using multi-threading, incrementing and decrementing the reference count can take a significant amount of time. The reason multithreading hurts here is that if you passed a shared pointer between threads, the reference count would end up shared between those threads, so any manipulation has to be synchronized between the threads. That can slow things down quite a bit.
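As a rough illustration of that shared count, the sketch below has several threads copy the same pointer in a loop. It assumes C++11 std::shared_ptr and std::thread (not std::tr1), and hammer_refcount is a made-up name. Every copy and every destruction is a synchronized update of one counter that all the threads contend on.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stress helper: each worker repeatedly copies the same
// shared_ptr, so all threads update one shared reference count.
long hammer_refcount(int threads, int copies_per_thread) {
    auto shared = std::make_shared<int>(42);

    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&shared, copies_per_thread] {
            for (int i = 0; i < copies_per_thread; ++i) {
                std::shared_ptr<int> local = shared;  // synchronized increment
                (void)*local;                         // touch the pointee
            }                                         // synchronized decrement
        });
    }
    for (auto& w : workers) w.join();

    // Every temporary copy is gone, so only the original owner remains.
    return shared.use_count();
}
```

The count stays correct because the updates are synchronized, and that synchronization is exactly the part that costs time. Timing this function with one thread versus several is one way to see how much contention adds on a given machine.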

Edit: For those who care about how much slower thread interlocking can make some fairly simple operations, see Herb Sutter's testing with a few implementations of CoW Strings. While his testing is far from perfect (e.g. he tested only on Windows), it still gives some idea about the kind of slow-down you can expect. For most practical purposes, you can/could think of a CoW string as something like a shared_ptr<charT>, with a lot of (irrelevant) member functions added.

Jerry Coffin answered Sep 29 '22 21:09