Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

map vs multimap in C++ (performance)

Tags:

c++

dictionary

I am working on data structure where input is very large almost 1 TB. I need to load data into associative container.

Data has some duplicate entires so i am using multimap but someone suggested me to use map of vector instead of using this. May i know what is the difference performance wise?

 map<const char*, const char*, cmptr> mulmap;

 map <const char*, vector <const char*> ,cmptr> mmap;
like image 238
Manish Avatar asked Feb 18 '13 08:02

Manish


1 Answers

You are wasting your time thinking about map versus multimap. Suppose that the number of bins is N and the average number of items per bin is M.

A std::multimap<Key, Val> typically uses an RB tree with duplicate keys.

  • Fetch is O(log N + log M)
  • Insert is O(log N + log M)
  • Delete is O(log N + log M)
  • Iteration is O(1)

A std::map<Key, std::vector<Val>> typically uses an RB tree with unique keys.

  • Fetch is O(log N)
  • Insert is O(log N)
  • Delete is O(log N)
  • Iteration is O(1)

As you can see, the difference is not worth talking about unless M is very large.

However, the storage of both is limited by RAM. 1 TB is simply not feasible for most systems, and no motherboard I've heard of supports it.

You are better off using a database for 1 TB of data. You can choose almost any database for this task. Kyoto Cabinet is simple and does what you want, but you can also use PostgreSQL, MySQL, Sqlite, Dynamo, Redis, MongoDB, Cassandra, Voldemort...

like image 139
Dietrich Epp Avatar answered Sep 18 '22 08:09

Dietrich Epp