I've got the following problem. I have a game which runs on average 60 frames per second. Each frame I need to store values in a container and there must be no duplicates.
It probably has to store less than 100 items per frame, but the number of insert-calls will be alot more (and many rejected due to it has to be unique). Only at the end of the frame do I need to traverse the container. So about 60 iterations of the container per frame, but alot more insertions.
Keep in mind the items to store are simple integer.
There are a bunch of containers I can use for this but I cannot make up my mind what to pick. Performance is the key issue for this.
Some pros/cons that I've gathered:
vector
set
unordered_set
I'm leaning on going the vector-route because of memory access patterns, even though set is clearly meant for this issue. The big issue that is unclear to me is whether traversing the vector for each insert is more costly than the allocations/deallocations (especially considering how often this must be done) and the memory lookups of set.
I know ultimately it all comes down to profiling each case, but if nothing else than as a headstart or just theoretically, what would probably be best in this scenario? Are there any pros/cons I might've missed aswell?
EDIT: As I didnt mention, the container is cleared() at the end of each frame
Set data structure is used to store unique values only, meaning no duplicate values would be stored in a set. When a HashSet is created, it internally implements a HashMap. An element can be inserted into the HashSet using the 'add' function.
Container library as flat_set .
Conclusion. The Set object is a great tool to have in your arsenal. Its ability to store unique values, especially when converting from other data types, makes it a great choice for quickly pulling out information that can be bogged down through repetitive noise.
So, the conclusion, use std::unordered_set or std::unordered_map (if you need the key-value feature). And you don't need to check before doing the insertion, these are unique-key containers, they don't allow duplicates.
I did timing with a few different methods that I thought were likely candidates. Using std::unordered_set
was the winner.
Here are my results:
Using UnorderedSet: 0.078s Using UnsortedVector: 0.193s Using OrderedSet: 0.278s Using SortedVector: 0.282s
Timing is based on the median of five runs for each case.
compiler: gcc version 4.9.1 flags: -std=c++11 -O2 OS: ubuntu 4.9.1 CPU: Intel(R) Core(TM) i5-4690K CPU @ 3.50GHz
Code:
#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <random>
#include <set>
#include <unordered_set>
#include <vector>
using std::cerr;
static const size_t n_distinct = 100;
template <typename Engine>
static std::vector<int> randomInts(Engine &engine,size_t n)
{
auto distribution = std::uniform_int_distribution<int>(0,n_distinct);
auto generator = [&]{return distribution(engine);};
auto vec = std::vector<int>();
std::generate_n(std::back_inserter(vec),n,generator);
return vec;
}
struct UnsortedVectorSmallSet {
std::vector<int> values;
static const char *name() { return "UnsortedVector"; }
UnsortedVectorSmallSet() { values.reserve(n_distinct); }
void insert(int new_value)
{
auto iter = std::find(values.begin(),values.end(),new_value);
if (iter!=values.end()) return;
values.push_back(new_value);
}
};
struct SortedVectorSmallSet {
std::vector<int> values;
static const char *name() { return "SortedVector"; }
SortedVectorSmallSet() { values.reserve(n_distinct); }
void insert(int new_value)
{
auto iter = std::lower_bound(values.begin(),values.end(),new_value);
if (iter==values.end()) {
values.push_back(new_value);
return;
}
if (*iter==new_value) return;
values.insert(iter,new_value);
}
};
struct OrderedSetSmallSet {
std::set<int> values;
static const char *name() { return "OrderedSet"; }
void insert(int new_value) { values.insert(new_value); }
};
struct UnorderedSetSmallSet {
std::unordered_set<int> values;
static const char *name() { return "UnorderedSet"; }
void insert(int new_value) { values.insert(new_value); }
};
int main()
{
//using SmallSet = UnsortedVectorSmallSet;
//using SmallSet = SortedVectorSmallSet;
//using SmallSet = OrderedSetSmallSet;
using SmallSet = UnorderedSetSmallSet;
auto engine = std::default_random_engine();
std::vector<int> values_to_insert = randomInts(engine,10000000);
SmallSet small_set;
namespace chrono = std::chrono;
using chrono::system_clock;
auto start_time = system_clock::now();
for (auto value : values_to_insert) {
small_set.insert(value);
}
auto end_time = system_clock::now();
auto& result = small_set.values;
auto sum = std::accumulate(result.begin(),result.end(),0u);
auto elapsed_seconds = chrono::duration<float>(end_time-start_time).count();
cerr << "Using " << SmallSet::name() << ":\n";
cerr << " sum=" << sum << "\n";
cerr << " elapsed: " << elapsed_seconds << "s\n";
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With