Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I implement caching in C++?

Suppose, I have a very large std::map< unsigned int, Foo > FooDB, which holds Foo objects in memory, retrievable by their ID. Now there might be more Foo objects than there is memory available to store them. So I'd like to have the following construct:

  • retrieve Foo object with ID x from FooDB
  • if object x is in FooDB, return it
  • if it isn't, load it from HD, try to store it in FooDB for further queries
    • there is enough memory available: add it to FooDB
    • there is not enough memory: free some space by removing from FooDB objects not in use (oldest query timestamp)

I'd like to reserve some memory for the FooDB and I can't tell, how many Foo objects can be stored in it, as they differ in size.

Any ideas on how to implement this?

EDIT

My basic problem is: how can I tell a std::map's size in memory? All heap objects stored in it included, of course. How can I know when the not enough memory part has been reached?

like image 462
Ben Avatar asked Aug 27 '12 14:08

Ben


People also ask

How do you implement caching?

So the standard way to implement cache is to have a data structure, using which we can access value by a given key in constant time. Now all good, we can save key value pairs in memory and retrieve it whenever we need it.

What is caching in C?

Caching is a separate memory, yes, and it has it's own addresses, but when the CPU caches memory lines from RAM, it keeps a record of what RAM-addresses the memory was on, and keeps a map between RAM-address and cache-address. From the point of view of your program, the address is the same.

Which data structure is best for caching?

An LRU cache is an efficient cache data structure that can be used to figure out what we should evict when the cache is full.


2 Answers

As far as I know, there's no way to ask an object what its size is, other than sizeof(). You said that sizeof() won't work because the Foo objects don't have a fixed size. In that case, if you can modify Foo, then maybe your Foo class can keep track of its memory footprint internally. And if you can't modify Foo, you might be able to write an external function that can deduce the memory footprint.

Fundamentally, it would be very difficult for the language/compiler/runtime to know how big a dynamically-sized object is because it doesn't know which allocations belong to the object. A simple solution, just recursively sum all of the things that it's members point to, will fail on anything that has a pointer to an object that it doesn't "own". Another simple solution, to keep track of all of the allocations done between when the constructor starts and when it returns, will fail for anything that makes allocations after the constructor is called.

You might want to just use the number of Foo's as your cache limit instead of the memory size. Unless you know a lot about the memory availability and usage of the entire system, a cap based on memory size would be arbitrary as well. And if you know a lot about the memory usage of the entire system, you could just use the overall memory availability to determine when to release objects from the cache.

like image 85
Tom Panning Avatar answered Sep 22 '22 09:09

Tom Panning


It's fairly straightforward.

Just place a reference to each in-memory Foo instance in FooDB in a sorted linked list ordered by age.

When you load a new item for the first time into memory add it to the front of the list.

When you read/modify an item move it from the middle of list to the front of the list.

When you need to delete an old item to make space pop it off the back of the list.

For example:

typedef shared_ptr<Foo> PFoo;

class Foo
{
    ...
    list<PFoo>::iterator age;
};

typedef map< unsigned int, PFoo > FooDB;
FooDB foodb; 

list<PFoo> ages;

void LoadFoo(PFoo foo)
{
    ages.push_front(foo);
}

void ReadFoo(PFoo foo)
{
    ...
    ages.erase(foo->age);
    ages.push_front(foo);
}

void MakeSpace()
{
    PFoo foo = ages.back();
    ages.pop_back();
    DeleteFoo(foo);
}
like image 20
Andrew Tomazos Avatar answered Sep 22 '22 09:09

Andrew Tomazos