Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

mmap-loadable data structure library for C++ (or C)

I have a some large data structure (N > 10,000) that usually only needs to be created once (at runtime), and can be reused many times afterwards, but it needs to be loaded very quickly. (It is used for user input processing on iPhoneOS.) mmap-ing a file seems to be the best choice.

Are there any data structure libraries for C++ (or C)? Something along the line

ReadOnlyHashTable<char, int> table ("filename.hash");
// mmap(...) inside the c'tor
...
int freq = table.get('a');
...
// munmap(...); inside the d'tor.

Thank you!


Details:

I've written a similar class for hash table myself but I find it pretty hard to maintain, so I would like to see if there's existing solutions already. The library should

  • Contain a creation routine that serialize the data structure into file. This part doesn't need to be fast.
  • Contain a loading routine that mmap a file into read-only (or read-write) data structure that can be usable within O(1) steps of processing.
  • Use O(N) amount of disk/memory space with a small constant factor. (The device has serious memory constraint.)
  • Small time overhead to accessors. (i.e. the complexity isn't modified.)

Assumptions:

  • Bit representation of data (e.g. endianness, encoding of float, etc.) does not matter since it is only used locally.
  • So far the possible types of data I need are integers, strings, and struct's of them. Pointers do not appear.

P.S. Can Boost.intrusive help?

like image 392
kennytm Avatar asked Feb 20 '10 09:02

kennytm


People also ask

Which data structure is used by map in c++?

Map: C++ Map is another commonly used STL container. The map is an ordered data structure that holds the data in an ordered or sorted form so that elements can easily be looked up in this dictionary-like data structure.

Does C have stack library?

The pop and return external functions are provided with the argument stack library. The pop external functions are described below according to the data type of the value that each pops from the argument stack. The return external functions are described in Return functions for C.

Which data structure is used for implementing map?

The internal implementation of map is self-balancing Binary Tree. There are generally two methods for implementing a self-balancing binary tree: Red-Black Tree and. AVL Trees.


1 Answers

You could try to create a memory mapped file and then create the STL map structure with a customer allocator. Your customer allocator then simply takes the beginning of the memory of the memory mapped file, and then increments its pointer according to the requested size. In the end all the allocated memory should be within the memory of the memory mapped file and should be reloadable later.

You will have to check if memory is free'd by the STL map. If it is, your customer allocator will lose some memory of the memory mapped file but if this is limited you can probably live with it.

like image 131
Patrick Avatar answered Oct 12 '22 16:10

Patrick