Correct me if I'm wrong, but std::map is an ordered map, so every time I insert a value the map has to keep its elements sorted internally, which takes some time.
My application gets information regarding some items on a constant interval.
This app keeps a map which is defined like this:
::std::map<DWORD, myItem*>
At first all items are considered "new" to the app. An "Item" object is allocated and added to this map, associating its id with a pointer to it.
When it's not a "new" item (just an update of this object), my app should find the object in the map using the given id and update it.
Most of the time I get updates.
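For reference, the lookup-or-insert flow described above might look roughly like this (onUpdate and the DWORD alias are my own stand-ins, not names from the app):

#include <map>

using DWORD = unsigned long; // normally from <windows.h>; alias here so the sketch is self-contained

struct myItem { /* fields omitted */ };

std::map<DWORD, myItem*> items;

void onUpdate(DWORD id /*, received fields */) {
    auto it = items.find(id);           // O(log n) lookup by id
    if (it == items.end()) {
        items.emplace(id, new myItem);  // "new" item: allocate and insert
    } else {
        // existing item: update *it->second with the received fields
    }
}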
My question is:
Is there a faster map implementation, or should I keep using this one?
Would I be better off using unordered_map?
Maps are 'fast enough', but not brilliant for some cases. Try to analyze the structure of the objects you need to store. If the fields are fixed, I'd recommend not using nested maps at all; see the sketch below.
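As a hypothetical illustration of that point (the field names are invented): if every item carries the same fixed fields, one map of plain structs beats a map of maps, since each lookup is a single tree walk and the fields sit in one allocation.

#include <map>

// Nested-map version: flexible, but every field access is another tree walk
// and every field is a separate node allocation.
// std::map<unsigned long, std::map<std::string, double>> itemsByField;

// Fixed-field version: one lookup per item, fields stored contiguously.
struct ItemData { double price; double quantity; long timestamp; };
std::map<unsigned long, ItemData> items;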
Would I be better off using unordered_map?
Possibly.
std::map provides consistent performance at O(log n) because it needs to be implemented as a balanced tree. But std::unordered_map will be implemented as a hash table, which might give you O(1) performance (a good hash function and a good distribution of keys across hash buckets), but it could be O(n) (everything lands in one hash bucket and devolves to a list). One would normally expect something in between these extremes.
So you can have reasonable performance (O(log n)) all the time, or you need to ensure everything lines up to get good performance with a hash.
As with any such question: you need to measure before committing to one approach. Unless your datasets are large you might find there is no significant difference.
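A minimal measurement sketch along those lines (the key type, element count and insert/update mix are assumptions, not taken from the question):

#include <chrono>
#include <cstddef>
#include <cstdio>
#include <map>
#include <unordered_map>

// Time n insert-or-update operations against a given map type.
template <class Map>
double run(std::size_t n) {
    Map m;
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i)
        m[static_cast<unsigned long>(i % (n / 4))] = static_cast<int>(i); // mix of inserts and updates
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

int main() {
    const std::size_t n = 1'000'000;
    std::printf("std::map:           %.1f ms\n", run<std::map<unsigned long, int>>(n));
    std::printf("std::unordered_map: %.1f ms\n", run<std::unordered_map<unsigned long, int>>(n));
}

The absolute numbers mean little; what matters is running the same harness with your real key distribution and your real payload type.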
Important warning: Unless you have measured (and your question suggests that you haven't) that map performance substantially influences your application performance (a large percentage of time is spent searching and updating the map), don't bother making it faster.
Stick to std::map (or std::unordered_map or any available hash_map implementation).
Speeding up your application by 1% probably will not be worth the effort.
Make it bug free instead.
Echoing Richard's answer: measure performance with different map implementations using your real classes and real data.
Some additional notes:
Understand the difference between expected cost (hash maps usually have a lower one), worst-case cost (O(log n) for a balanced binary tree, but much higher for a hash map if an insert triggers reallocation of the hash array) and amortized cost (total cost divided by the number of operations or elements; it depends on things like the ratio of new to existing elements). You need to find out which is more constraining in your case. For example, reallocation of a hash map can be too much if you need to adhere to a very low latency limit.
Find out where the real bottleneck is. It might be that the cost of searching the map is insignificant compared to, e.g., I/O cost.
Try a more specialized map implementation. For example, a lot can be gained if you know something more about the map's keys. Authors of generic map implementations have no such knowledge.
In your example (32-bit unsigned integer keys which strongly cluster, e.g. are assigned sequentially) you can use a radix-based approach. A very simple example (treat it as an illustration, not a ready-to-use recipe):
Item *sentinel[65536]; // sentinel page, all entries initialized to NULL
Item **pages[65536];   // list of pages,
                       // initialized so every element points to sentinel
Then search is as simple as:
Item *value = pages[index >> 16][index & 0xFFFF];
When you need to set new value:
if (pages[index >> 16] == sentinel) {
pages[index >> 16] = allocate_new_null_filled_page();
}
pages[index >> 16][index & 0xFFFF] = value;
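A slightly more complete sketch of the same idea, with the initialization and cleanup spelled out (the RadixMap name and the destructor are my additions, not part of the recipe above):

#include <cstdint>

struct Item;

class RadixMap {
    Item *sentinel_[65536];  // shared page of NULLs for slots that hold no items yet
    Item **pages_[65536];    // top-level table; unused slots point at sentinel_
public:
    RadixMap() {
        for (auto &s : sentinel_) s = nullptr;
        for (auto &p : pages_) p = sentinel_;
    }
    Item *get(std::uint32_t key) const {
        return pages_[key >> 16][key & 0xFFFF];
    }
    void set(std::uint32_t key, Item *value) {
        Item **&page = pages_[key >> 16];
        if (page == sentinel_)
            page = new Item *[65536]();  // value-initialized to NULLs
        page[key & 0xFFFF] = value;
    }
    ~RadixMap() {
        for (auto p : pages_)
            if (p != sentinel_) delete[] p;
    }
};

Note that the two fixed tables alone occupy about a megabyte of pointers on a 64-bit platform, so you would keep one long-lived instance rather than creating these casually.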
Tweak your map implementation. E.g. every hash_map likes to know the approximate number of elements in advance; that helps it avoid unnecessary reallocation of the hash table and (possibly) rehashing of all keys (see the reserve sketch below).
With my specialized example above, you would certainly try different page sizes, or a three-level version.
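For the standard hash container that pre-sizing is a one-liner; a minimal sketch (the count of 1,000,000 expected elements is an assumption):

#include <unordered_map>

struct Item;

int main() {
    std::unordered_map<unsigned long, Item*> items;
    // Tell the table roughly how many elements to expect so it can size its
    // bucket array up front instead of rehashing repeatedly as the map grows.
    items.reserve(1'000'000);
}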
A common optimization is providing a specialized memory allocator to avoid multiple allocations of small objects.
Whenever you insert or delete an item, the memory allocation/deallocation costs a lot. Instead you can use an allocator like this one: https://github.com/moya-lang/Allocator which, according to the author, speeds up std::map by a factor of two; I found it even faster, especially for other STL containers.
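If you prefer to stay within the standard library, C++17 polymorphic allocators give a similar effect; a hedged sketch (the pool choice and key/value types are my own, not taken from the linked library):

#include <map>
#include <memory_resource>

struct Item;

int main() {
    // The pool resource hands out map nodes from larger chunks, so each
    // insert/erase no longer goes to the general-purpose heap allocator.
    std::pmr::unsynchronized_pool_resource pool;
    std::pmr::map<unsigned long, Item*> items(&pool);

    items[42] = nullptr;  // nodes come from the pool
}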