
C++: container replacement for vector/deque for huge sizes

So my application has containers with 100 million and more elements.

I'm on the hunt for a container which behaves better, time-wise, than std::deque (let alone std::vector) with respect to frequent insertions and deletions all over the container, including near the middle. Access time to the n-th element does not need to be as fast as with vector, but it should definitely be better than the full traversal required by std::list (which has huge per-element memory overhead anyway).

Elements should be treated as ordered by index (like vector, deque, list), so std::set or std::unordered_set do not work well either.

Before I sit down and code such a container myself: has anyone seen such a beast already? I'm pretty sure the STL doesn't have anything like this, and looking at Boost I did not find anything I could use, but I may be wrong.

Any hints?

asked Oct 07 '22 by BaCh


2 Answers

There's a whole STL replacement for big data, in case your app is centered on such data:

  • STXXL - http://stxxl.sourceforge.net/

edit: I was actually a bit quick to answer. 100 million is not really a large number; e.g., if each element is one byte, it fits in a ~96 MiB array. So for STXXL to be useful, the element size should be significantly bigger.
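
For reference, here is a minimal usage sketch of an STXXL external-memory vector, following the VECTOR_GENERATOR typedef pattern from the STXXL documentation. The exact template parameters and default block/cache settings are assumptions and may differ between STXXL versions:

```cpp
// Minimal sketch: an external-memory vector that keeps most of its data on
// disk and only caches blocks in RAM. The VECTOR_GENERATOR pattern is taken
// from the STXXL documentation; defaults may vary between versions.
#include <iostream>
#include <stxxl/vector>

int main() {
    // Default block/cache configuration; STXXL reads its disk layout from
    // a .stxxl config file or falls back to built-in defaults.
    typedef stxxl::VECTOR_GENERATOR<double>::result vector_type;

    vector_type v;
    for (std::size_t i = 0; i < 100 * 1000 * 1000; ++i)
        v.push_back(static_cast<double>(i));

    std::cout << "size: " << v.size() << ", v[12345] = " << v[12345] << '\n';
}
```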

answered Oct 15 '22 by Sebastian Mach


I think you can get the performance characteristics that you want with a skip list:

https://en.wikipedia.org/wiki/Skip_list#Indexable_skiplist

It's the "indexable" part that you're interested in, of course -- you don't actually want the items to be sorted. So some modification is needed that I leave as an exercise.

You might find that 100 million list nodes begin to strain a 32-bit address space, but that's probably not an issue on 64-bit.
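
For concreteness, here is a rough, unoptimized sketch of such an unsorted, indexable skip list: every forward link stores a "width" (how many elements it skips), which gives O(log n) expected access and insertion by index. The class and member names are made up for illustration; there is no bounds checking, no erase(), and it is not meant as a drop-in container:

```cpp
// Sketch of an indexable skip list: nodes are kept in insertion order (not
// sorted), and each forward link carries the number of elements it spans.
#include <cstddef>
#include <iostream>
#include <random>
#include <vector>

template <typename T>
class IndexedSkipList {
    static const int MAX_LEVEL = 32;

    struct Node {
        T value{};
        std::vector<Node*> forward;
        std::vector<std::size_t> width;  // elements spanned by each forward link
        Node(int levels, const T& v = T{})
            : value(v), forward(levels, nullptr), width(levels, 0) {}
    };

    Node head_{MAX_LEVEL};
    std::size_t size_ = 0;
    std::mt19937 rng_{std::random_device{}()};

    int random_level() {                 // geometric distribution, p = 1/2
        int lvl = 1;
        while (lvl < MAX_LEVEL && (rng_() & 1)) ++lvl;
        return lvl;
    }

public:
    ~IndexedSkipList() {
        Node* x = head_.forward[0];
        while (x) { Node* next = x->forward[0]; delete x; x = next; }
    }

    std::size_t size() const { return size_; }

    // Access element at index i (0-based); no bounds checking in this sketch.
    const T& at(std::size_t i) const {
        std::size_t target = i + 1, pos = 0;   // the head has rank 0
        const Node* x = &head_;
        for (int lvl = MAX_LEVEL - 1; lvl >= 0; --lvl)
            while (x->forward[lvl] && pos + x->width[lvl] <= target) {
                pos += x->width[lvl];
                x = x->forward[lvl];
            }
        return x->value;
    }

    // Insert value so that it ends up at index i (0 <= i <= size()).
    void insert(std::size_t i, const T& value) {
        Node* update[MAX_LEVEL];              // rightmost node left of i, per level
        std::size_t update_pos[MAX_LEVEL];    // its rank
        Node* x = &head_;
        std::size_t pos = 0;
        for (int lvl = MAX_LEVEL - 1; lvl >= 0; --lvl) {
            while (x->forward[lvl] && pos + x->width[lvl] <= i) {
                pos += x->width[lvl];
                x = x->forward[lvl];
            }
            update[lvl] = x;
            update_pos[lvl] = pos;
        }
        std::size_t new_rank = i + 1;
        int levels = random_level();
        Node* n = new Node(levels, value);
        for (int lvl = 0; lvl < levels; ++lvl) {
            n->forward[lvl] = update[lvl]->forward[lvl];
            update[lvl]->forward[lvl] = n;
            if (n->forward[lvl])              // split the old width between the two links
                n->width[lvl] = update[lvl]->width[lvl] + 1 - (new_rank - update_pos[lvl]);
            update[lvl]->width[lvl] = new_rank - update_pos[lvl];
        }
        for (int lvl = levels; lvl < MAX_LEVEL; ++lvl)
            if (update[lvl]->forward[lvl])
                ++update[lvl]->width[lvl];    // one more element now sits under this link
        ++size_;
    }
};

int main() {
    IndexedSkipList<int> list;
    for (int k = 0; k < 5; ++k) list.insert(list.size(), k * 10);  // 0 10 20 30 40
    list.insert(2, 99);                                            // 0 10 99 20 30 40
    for (std::size_t j = 0; j < list.size(); ++j) std::cout << list.at(j) << ' ';
    std::cout << '\n';
}
```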

answered Oct 15 '22 by Steve Jessop