Suppose I have a very large std::map<unsigned int, Foo> FooDB, which holds Foo objects in memory, retrievable by their ID. Now there might be more Foo objects than there is memory available to store them. So I'd like to have the following construct:

- query the Foo object with ID x from FooDB
- if it is in FooDB, return it
- keep FooDB for further queries
- if there is not enough memory for FooDB, remove the Foo objects not in use (oldest query timestamp)

I'd like to reserve some memory for the FooDB, and I can't tell how many Foo objects can be stored in it, as they differ in size.
Any ideas on how to implement this?
EDIT
My basic problem is: how can I tell a std::map's size in memory, with all the heap objects stored in it included, of course? How can I know when the "not enough memory" point has been reached?
So the standard way to implement a cache is to have a data structure through which we can access a value by a given key in constant time. With that in place, we can store key-value pairs in memory and retrieve them whenever we need them.
Caching is a separate memory, yes, and it has its own addresses, but when the CPU caches memory lines from RAM, it keeps a record of which RAM addresses the data came from and maintains a map between RAM address and cache address. From the point of view of your program, the address is the same.
An LRU (least recently used) cache is an efficient cache data structure that can be used to decide what to evict when the cache is full.
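As a minimal sketch of that idea (the class and member names here are my own, and the value type is a plain int just to keep the example short; in a FooDB it would be a Foo or a shared_ptr<Foo>), an LRU cache can combine an std::list ordered by recency with an std::unordered_map from key to list iterator:

#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>

class LRUCache
{
public:
    explicit LRUCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns a pointer to the cached value, or nullptr on a miss.
    int* get(unsigned int key)
    {
        auto it = index_.find(key);
        if (it == index_.end())
            return nullptr;
        // Accessing an entry makes it the most recently used: move it to the front.
        items_.splice(items_.begin(), items_, it->second);
        return &it->second->second;
    }

    void put(unsigned int key, int value)
    {
        auto it = index_.find(key);
        if (it != index_.end())
        {
            it->second->second = value;
            items_.splice(items_.begin(), items_, it->second);
            return;
        }
        if (capacity_ != 0 && items_.size() >= capacity_)
        {
            // Evict the least recently used entry, which sits at the back.
            index_.erase(items_.back().first);
            items_.pop_back();
        }
        items_.emplace_front(key, value);
        index_[key] = items_.begin();
    }

private:
    std::size_t capacity_;
    std::list<std::pair<unsigned int, int>> items_;   // front = most recently used
    std::unordered_map<unsigned int,
        std::list<std::pair<unsigned int, int>>::iterator> index_;
};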
As far as I know, there's no way to ask an object what its size is, other than sizeof(). You said that sizeof() won't work because the Foo objects don't have a fixed size. In that case, if you can modify Foo, then maybe your Foo class can keep track of its memory footprint internally. And if you can't modify Foo, you might be able to write an external function that can deduce the memory footprint.
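For example, if Foo can be modified, a hypothetical memory_footprint() member might look roughly like this (the members shown here are made up purely for illustration, and the estimate ignores allocator overhead and padding):

#include <cstddef>
#include <string>
#include <vector>

class Foo
{
public:
    // Rough estimate of this object's memory usage: the object itself
    // plus the heap blocks owned by its members.
    std::size_t memory_footprint() const
    {
        return sizeof(Foo)
             + name_.capacity()                    // std::string buffer (ignoring SSO)
             + data_.capacity() * sizeof(double);  // std::vector buffer
    }

private:
    std::string name_;
    std::vector<double> data_;
};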
Fundamentally, it would be very difficult for the language/compiler/runtime to know how big a dynamically sized object is, because it doesn't know which allocations belong to the object. One simple approach, recursively summing everything its members point to, will fail on anything that holds a pointer to an object it doesn't "own". Another simple approach, tracking all of the allocations made between the start of the constructor and its return, will fail for anything that allocates after the constructor has run.
You might want to just use the number of Foo objects as your cache limit instead of the memory size. Unless you know a lot about the memory availability and usage of the entire system, a cap based on memory size would be just as arbitrary. And if you do know a lot about the memory usage of the entire system, you could simply use the overall memory availability to decide when to release objects from the cache.
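A count-based cap could be as simple as the following sketch (the shared_ptr value type and the MAX_FOOS constant are assumptions on my part, not something from the question):

#include <cstddef>
#include <map>
#include <memory>

class Foo;                                               // defined elsewhere
typedef std::map<unsigned int, std::shared_ptr<Foo>> FooDB;

const std::size_t MAX_FOOS = 100000;                     // arbitrary cap, tune for your system

// Returns true if another Foo may be inserted without exceeding the cap.
bool HasRoom(const FooDB& db)
{
    return db.size() < MAX_FOOS;
}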
It's fairly straightforward.
Just keep a reference to each in-memory Foo instance in FooDB in a linked list ordered by age, with the most recently used item at the front.
When you load a new item into memory for the first time, add it to the front of the list.
When you read or modify an item, move it from wherever it is in the list to the front.
When you need to delete an old item to make space, pop it off the back of the list.
For example:
#include <list>
#include <map>
#include <memory>

class Foo;                                  // forward declaration so PFoo can refer to it
typedef std::shared_ptr<Foo> PFoo;

class Foo
{
...
public:
    std::list<PFoo>::iterator age;          // this object's position in the age list
};

typedef std::map<unsigned int, PFoo> FooDB;
FooDB foodb;
std::list<PFoo> ages;                       // front = most recently used

void LoadFoo(PFoo foo)
{
    ages.push_front(foo);
    foo->age = ages.begin();                // remember where it sits in the list
}

void ReadFoo(PFoo foo)
{
    ...
    ages.erase(foo->age);                   // unlink from its current position
    ages.push_front(foo);                   // re-insert at the front (most recent)
    foo->age = ages.begin();
}

void MakeSpace()
{
    PFoo foo = ages.back();                 // least recently used entry
    ages.pop_back();
    DeleteFoo(foo);                         // removes it from foodb and frees the memory
}
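Note that foo->age has to be updated every time an entry is moved. As an alternative to the erase/push_front pair, std::list::splice can move the node to the front of the list without invalidating its iterator, so foo->age would stay valid on its own.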