Two maybe-trivial questions:
Really. I spent the last three days implementing something faster than std::sort, just for the sake of doing it. std::sort is supposed to be an introsort, and I suspect it uses a single-pivot quicksort internally. Epic fail: mine was at least twice as slow.
In my utter bitterness I even copy-pasted other (top-notch) programmers' code. To no avail. I benchmarked my other algorithms too... My binary search, and my upper_bound and lower_bound versions, are so stripped down they couldn't really be written with fewer instructions. Still, they are about twice as slow.
I ask, why, why, why? And this leads me to my next question...
Of course, I want to look at their sources! Is it even possible to write more efficient code than that, or am I at an abstraction level in my "simple" main.cpp where I cannot reach the optimisations utilized by the STL?
I mean, for example... let's take maps, which are simple associative containers. The documentation says std::map is implemented with a red-black tree. Now... would it be worth it to try implementing my own red-black tree, or have they taken this joy :-) away from me, and should I just throw every piece of data I get my hands on into the map container?
I hope this does make sense. If not, please forgive me.
The short answer is "if it were possible to write faster code that did the same thing, the standard library would have done it already".
The standard library is designed by clever people, and the reason it was made part of C++ is that other clever people recognized it as being clever. And since then, 15 years have passed in which still other clever people have taken these specifications and written the most efficient implementations of them that they could.
That's a lot of cleverness you're trying to compete with. ;)
So there is no magic in the STL; they don't cheat, or use tricks unavailable to you. It is just very carefully designed to maximize performance.
The thing about C++ is that it's not a fast language as such. If you're not careful, it is easy to introduce all sorts of inefficiencies: virtual function calls, cache misses, excessive memory allocations, unnecessary copying of objects. Any of these can cripple the performance of C++ code.
With care, you can write code that's about as efficient as the STL. It's not that special. But in general, the only way you're going to get faster code is to change the requirements. The standard library is required to be general, to work as well as possible across all use cases. If your requirement is more specific, it is sometimes possible to write specialized code that favors those specific cases. But then the tradeoff is that the code either will not work, or will be inefficient, in other cases.
A final point is that a key part of the reason the STL is so clever, and why it was adopted into the standard, is that it is pretty much zero-overhead. The standard libraries in many languages are "fast enough", but not as fast as hand-rolled code. They have a sorting algorithm, but it's not quite as fast as if you wrote it yourself in place. It might use a few casts to and from a common "object" base class, or boxing on value types. The STL is designed so that pretty much everything can be inlined by the compiler, yielding code equivalent to what you'd get by hand-rolling it yourself. It uses templates to specialize for the type you're using, so there's no overhead of converting to a type understood by the container or algorithm.
That's why it's hard to compete with. It is a ridiculously efficient library, and it had to be. With the mentality of your average C or C++ programmer, especially 10-15 years ago, no one would ever use a std::vector if it were 5% slower than a raw array. No one would use iterators and standard algorithms if they weren't as fast as just writing the loop yourself.
So the STL pioneered a lot of clever C++ tricks in order to become just as efficient as hand-rolled C code.
They are probably optimized to a great extent. Such implementations take memory page faults, cache misses, etc. into account.
Whether you can read the source of those implementations depends on the compiler they ship with. I think most compilers (even Microsoft's) will let you see them.
I think the most important things to know are the architecture you are compiling to and the operating system (if any) your program will be running on. Understanding these things will allow you to precisely target the hardware.
There are also countless optimization techniques. This is a good summary. Also, global optimization is a whole science, so there are certainly many things to learn.
There are some clever things on this site, too. Good luck!