I am working on a data structure where the input is very large, almost 1 TB. I need to load the data into an associative container.
The data has some duplicate entries, so I am using a multimap, but someone suggested I use a map of vectors instead. What is the difference performance-wise?
multimap<const char*, const char*, cmptr> mulmap;
map<const char*, vector<const char*>, cmptr> mmap;
You are wasting your time thinking about map versus multimap. Suppose that the number of bins is N and the average number of items per bin is M.
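A std::multimap<Key, Val> typically uses an RB tree with duplicate keys. The tree holds N*M nodes, so a lookup or insert costs O(log(N*M)) = O(log N + log M).
A std::map<Key, std::vector<Val>> typically uses an RB tree with unique keys. The tree holds N nodes, so a lookup costs O(log N), and an insert costs O(log N) plus an amortized O(1) append to the vector.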
As you can see, the difference is not worth talking about unless M is very large.
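To make the comparison concrete, here is a minimal sketch of the two layouts side by side. The definition of cmptr is not shown in the question, so the version below (ordering C strings with strcmp) is an assumption, and the helper names add and count_values are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <vector>

// Assumed comparator: orders C strings by content, not by pointer value.
// The question's real cmptr is not shown, so this is only a guess at it.
struct cmptr {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

using MultiMap = std::multimap<const char*, const char*, cmptr>;
using VecMap   = std::map<const char*, std::vector<const char*>, cmptr>;

// multimap: every (key, value) pair gets its own tree node.
inline void add(MultiMap& m, const char* key, const char* val) {
    m.emplace(key, val);
}

// map of vector: one tree node per distinct key; values go into its vector.
inline void add(VecMap& m, const char* key, const char* val) {
    m[key].push_back(val);
}

// Number of values stored under key in the multimap.
inline std::size_t count_values(const MultiMap& m, const char* key) {
    return m.count(key);
}

// Number of values stored under key in the map of vectors (0 if absent).
inline std::size_t count_values(const VecMap& m, const char* key) {
    auto it = m.find(key);
    return it == m.end() ? 0 : it->second.size();
}
```

After inserting the same pairs into both, size() reports the number of pairs for the multimap and the number of distinct keys for the map, while count_values gives the same answer for either. Both containers store only the pointers, so the strings themselves must outlive the container.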
However, the storage of both is limited by RAM. 1 TB is simply not feasible for most systems, and no motherboard I've heard of supports it.
You are better off using a database for 1 TB of data. You can choose almost any database for this task. Kyoto Cabinet is simple and does what you want, but you can also use PostgreSQL, MySQL, SQLite, Dynamo, Redis, MongoDB, Cassandra, Voldemort...