I am reading line segments from cin. Each segment is represented by a 2D start point and a 2D end point (X and Y coordinates).
The input arrives in random order, not sorted. (Update: but I need the segments sorted first by X and then by Y.)
I can read in all the segments, store them in a vector and then call std::sort. On the other hand, I can create an empty std::set and insert each segment as it arrives. The set will automatically maintain sorted order. Which of the two approaches is more efficient?
Update: The total size of the input (number of segments) is known in advance.
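For concreteness, a minimal sketch of the first approach might look like the following. The `Segment`/`Point` layout, the input format, and comparing by the start point are assumptions, since the question doesn't specify them:

```cpp
#include <algorithm>
#include <iostream>
#include <vector>

struct Point   { double x, y; };
struct Segment { Point start, end; };

int main() {
    std::size_t n;
    std::cin >> n;                 // total count is known in advance

    std::vector<Segment> segments;
    segments.reserve(n);           // one allocation up front

    for (std::size_t i = 0; i < n; ++i) {
        Segment s;
        std::cin >> s.start.x >> s.start.y >> s.end.x >> s.end.y;
        segments.push_back(s);
    }

    // Sort by X, then by Y (here: of the start point).
    std::sort(segments.begin(), segments.end(),
              [](const Segment& a, const Segment& b) {
                  if (a.start.x != b.start.x) return a.start.x < b.start.x;
                  return a.start.y < b.start.y;
              });
}
```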
In general, std::sort is indeed faster than qsort for a couple of reasons: qsort operates on void*, so every comparison goes through a function pointer and must cast and dereference its arguments, and qsort knows only the size of each element, not its type, so swaps are performed byte by byte rather than with the type's own move or swap operations.
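A tiny illustration of that interface difference, on plain ints:

```cpp
#include <algorithm>
#include <cstdlib>

int compare_ints(const void* a, const void* b) {
    // qsort hands us void*: we must cast and dereference ourselves.
    int lhs = *static_cast<const int*>(a);
    int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);
}

int main() {
    int a[] = {3, 1, 2};
    int b[] = {3, 1, 2};

    // qsort only knows the element size and calls the comparator
    // through a function pointer.
    std::qsort(a, 3, sizeof(int), compare_ints);

    // std::sort knows the element type, so the comparison can be
    // inlined and swaps use the type's own operations.
    std::sort(b, b + 3, [](int lhs, int rhs) { return lhs < rhs; });
}
```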
std::set is an associative container that contains a sorted set of unique objects of type Key. Sorting is done using the key comparison function Compare.
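A sketch of the insert-as-you-read approach using such a Compare (same assumed `Segment`/`Point` layout and "by start X, then Y" ordering as above). Note that because std::set stores unique keys, exact duplicate segments would be silently dropped; std::multiset would keep them:

```cpp
#include <iostream>
#include <set>

struct Point   { double x, y; };
struct Segment { Point start, end; };

struct ByStartXY {
    bool operator()(const Segment& a, const Segment& b) const {
        if (a.start.x != b.start.x) return a.start.x < b.start.x;
        return a.start.y < b.start.y;
    }
};

int main() {
    std::size_t n;
    std::cin >> n;

    std::set<Segment, ByStartXY> segments;
    for (std::size_t i = 0; i < n; ++i) {
        Segment s;
        std::cin >> s.start.x >> s.start.y >> s.end.x >> s.end.y;
        segments.insert(s);   // O(log n) per insert, order maintained
    }
}
```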
std::sort() is an algorithm from the C++ Standard Library. It takes a beginning iterator and an ending iterator and, by default, sorts the range in ascending order. It can also perform custom sorting if you pass in a comparator function that returns a bool.
std::sort is a generic algorithm that works for anything providing iterators, but it requires random-access iterators to sort efficiently. std::list, being a linked list, cannot provide random access to its elements, which is why it provides its own specialized member sort().
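For example:

```cpp
#include <list>

int main() {
    std::list<int> values = {3, 1, 2};

    // std::sort(values.begin(), values.end());  // would not compile:
    //                                           // iterators are bidirectional only
    values.sort();   // member function, relinks nodes instead of moving elements
}
```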
You should measure the performance of both approaches to be sure, but it's a safe bet that std::sort on a std::vector is much faster than inserting into a std::set, due to locality effects and the large constants hidden in the tree-insertion algorithm. The subsequent lookups and iteration will also be faster.
(However, std::set is better suited to a mixed series of insertions, deletions, lookups, and iterations. Maintaining sorted order in a vector is expensive in that case, as each insertion takes linear time on average.)
As a good rule of thumb, the stricter the guarantees offered, the worse the performance you'll get.
Inserting into a std::set guarantees that the sequence is sorted after every insertion. Inserting into a std::vector and calling std::sort once after all insertions have been done only guarantees that the sequence is sorted once all manipulations of the vector are finished; the vector does not need to be sorted during the intermediate insertions.
A std::vector also exhibits better spatial locality and requires fewer memory allocations.
So I would assume the vector approach to be faster, but if performance matters to you, then it matters enough to be measured.
If you don't care to measure what is faster in your case for your data sets with your code in your application, then you don't care which is faster.
Use the container that has the appropriate semantics for your needs. Efficiency generally follows on automatically from that choice.
If you then experience performance bottlenecks, do some benchmarking.
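A minimal way to do that benchmarking is with std::chrono. This sketch uses random doubles as a stand-in for the segment data, so treat the numbers only as a rough indication and rerun it with your real input and compiler flags:

```cpp
#include <algorithm>
#include <chrono>
#include <iostream>
#include <random>
#include <set>
#include <vector>

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> dist(0.0, 1000.0);

    std::vector<double> input(1000000);
    for (double& v : input) v = dist(rng);

    auto t0 = std::chrono::steady_clock::now();
    std::vector<double> vec(input.begin(), input.end());
    std::sort(vec.begin(), vec.end());
    auto t1 = std::chrono::steady_clock::now();

    std::multiset<double> tree(input.begin(), input.end());
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::duration<double, std::milli>;
    std::cout << "vector + sort: " << ms(t1 - t0).count() << " ms\n"
              << "set insertion: " << ms(t2 - t1).count() << " ms\n";
}
```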