As the title says, I have in my mind some methods to do it but I don't know which is fastest.
So let's say that we have a: vector<int> vals
with some values
1
After my vals
are added
sort(vals.begin(), vals.end());
auto last = unique(vals.begin(), vals.end());
vals.erase(last, vals.end());
2
Convert to set after my vals
are added:
set<int> s( vals.begin(), vals.end() );
vals.assign( s.begin(), s.end() );
3
When i add my vals
, i check if it's already in my vector:
if( find(vals.begin(), vals.end(), myVal)!=vals.end() )
// add my val
4
Use a set from start
Ok, I've got these 4 methods, my questions are:
1 From 1, 2 and 3 which is the fastest?
2 Is 4 faster than the first 3?
3 At 2 after converting the vector to set, it's more convenabile to use the set to do what I need to do or should I do the vals.assign( .. )
and continue with my vector?
Question 1: Both 1 and 2 are O(n log n), 3 is O(n^2). Between 1 and 2, it depends on the data.
Question 2: 4 is also O(n log n) and can be better than 1 and 2 if you have lots of duplicates, because it only stores one copy of each. Imagine a million values that are all equal.
Question 3: Well, that really depends on what you need to do.
The only thing that can be said without knowing more is that your alternative number 3 is asymptotically worse than the others.
If you're using C++11 and don't need ordering, you can use std::unordered_set
, which is a hash table and can be significantly faster than std::set
.
Option 1 is going to beat all the others. The complexity is just O(N log N) and the contiguous memory of vector keeps the constant factors low.
std::set typically suffers a lot from non-contiguous allocations. It's not just slow to access those, just creating them takes significant time as well.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With