Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance of C++ Google dense_hash_set inserts

Tags:

c++

hashmap

I have a C++ program that is inserting approx. 18 million numbers of type uint64_t into a google dense_hash_set.

The numbers are all the even numbers below 2^64 with the property

N >= radical(N)^4.

The inserts are an order of magnitude slower than, for example, inserting 18 million random numbers, or 18 million sequential numbers.

When performing the inserts, the code seems to spend most of its time executing the statement

if ( test_empty(bucknum) )

Is 18 million a sensible number of items to insert into a dense_hash_set?

Is there any way of speeding up the inserts?

Relevant lines are

uint64_t N;
google::dense_hash_set<uint64_t> evencandidates;
evencandidates.set_empty_key(-1);
.....
evencandidates.insert(N);
like image 477
Arthur Vause Avatar asked Dec 08 '25 22:12

Arthur Vause


1 Answers

Solved by replacing the default hash function with std::tr1::hash. The declaration of the dense hash set becoming:

google::dense_hash_set<uint64_t, std::tr1::hash<uint64_t> > evencandidates;

The criterion for choosing numbers to store,

N >= radical(N)^4 and N even

results in the numbers being stored have a lot of common factors, particluarly several powers of 2 for most numbers. Choosing a hash function that works well with this set of numbers solves the performance problem.

like image 182
Arthur Vause Avatar answered Dec 10 '25 11:12

Arthur Vause



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!