Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to optimize reusing a large std::unordered_map as a temporary in a frequently called function?

Simplified question with a working example: I want to reuse a std::unordered_map (let's call it umap) multiple times, similar to the following dummy code (which does not do anything meaningful). How can I make this code run faster?

#include <iostream>
#include <unordered_map>
#include <time.h>

unsigned size = 1000000;

void foo(){
    std::unordered_map<int, double> umap;
    umap.reserve(size);
    for (int i = 0; i < size; i++) {
        // in my real program: umap gets filled with meaningful data here
        umap.emplace(i, i * 0.1);
    }
    // ... some code here which does something meaningful with umap
}

int main() {

    clock_t t = clock();

    for(int i = 0; i < 50; i++){
        foo();
    }

    t = clock() - t;
    printf ("%f s\n",((float)t)/CLOCKS_PER_SEC);

    return 0;
}

In my original code, I want to store matrix entries in umap. In each call to foo, the key values start from 0 up to N, and N can be different in each call to foo, but there is an upper limit of 10M for indices. Also, values can be different (contrary to the dummy code here which is always i*0.1).

I tried to make umap a non-local variable, for avoiding the repeated memory allocation of umap.reserve() in each call. This requires to call umap.clear() at the end of foo, but that turned out to be actually slower than using a local variable (I measured it).

like image 799
Abaris Avatar asked Jan 24 '19 06:01

Abaris


People also ask

Is unordered_map faster than map?

Generally, an unordered_map in C++ is faster than map in C++ because the average time complexity for insertion, deletion, and updation is O(1) while in the case of map, the average time complexity for all the operations is O(log(n)) where n is the number of elements present inside the map.

Why do you use unordered_map why not ordered map?

map is used to store elements as key,value pairs in sorted order. unordered_map is used to store elements as key,value pairs in non-sorted order.

Why is unordered_map slow?

std::unordered_map is supposedly slow because it has fairly stringent iterator invalidation requirements. In my experience, unless you wring the most performance out of your code as you can, it's not a huge issue; it's generally faster than most casual implementations.

How are elements stored in unordered_map?

unordered_map is an associated container that stores elements formed by the combination of key-value and a mapped value. The key value is used to uniquely identify the element and the mapped value is the content associated with the key. Both key and value can be of any type predefined or user-defined.


1 Answers

I don't think there is any good way to accomplish what you're looking for directly -- i.e. you can't clear the map without clearing the map. I suppose you could allocate a number of maps up-front, and just use each one of them a single time as a "disposable map", and then go on to use the next map during your next call, but I doubt this would give you any overall speedup, since at the end of it all you'd have to clear all of them at once, and in any case it would be very RAM-intensive and cache-unfriendly (in modern CPUs, RAM access is very often the performance bottleneck, and therefore minimizing the number cache misses is the way to maximize effiency).

My suggestion would be that if clear-speed is so critical, you may need to move away from using unordered_map entirely, and instead use something simpler like a std::vector -- in that case you can simply keep a number-of-valid-items-in-the-vector integer, and "clearing" the vector is a matter of just setting the count back to zero. (Of course, that means you sacrifice unordered_map's quick-lookup properties, but perhaps you don't need them at this stage of your computation?)

like image 132
Jeremy Friesner Avatar answered Sep 28 '22 15:09

Jeremy Friesner