I'm trying to implement LFU (Least Frequently Used) cache using pure STL (I don't want to use Boost!).
Requirements are:
- Associative access by Key, like with std::map.
- Ability to release the least-used item (by its UsesCount attribute).
- Ability to update the priority (UsesCount) of any item.

The problems are:
- With std::vector as the container of items (Key, Value, UsesCount), std::map as a container of iterators into the vector for associative access, and std::make_heap, std::push_heap and std::pop_heap as the priority queue implementation within the vector, the iterators in the map are not valid after heap operations.
- With std::list (or std::map) instead of std::vector in the previous configuration, std::make_heap etc. can't be compiled because their iterators do not support arithmetic.
- With std::priority_queue, I don't have the ability to update item priority.

The questions are:
Thank you for your insights.
Your implementation using the *_heap functions and a vector seems to be a good fit, although it will result in slow updates. The iterator invalidation problem you encounter is normal for any container that uses a vector as its underlying data structure. This is the approach also taken by boost::heap::priority_queue, but it does not provide a mutable interface for the reason mentioned above. Other boost::heap data structures offer the ability to update the heap.
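To make the invalidation concrete, here is a minimal sketch. `Item`, `more_used` and `key_at_after_push` are hypothetical names introduced only for this illustration. It shows that a stored position (index or iterator) keeps referring to a slot in the vector, not to the item that used to live there.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical item type, used only for this illustration.
struct Item {
    std::string key;
    int uses_count;
};

// Comparator that keeps the least-used item at the front (a min-heap on uses_count).
inline bool more_used(const Item& a, const Item& b) {
    return a.uses_count > b.uses_count;
}

// Pushes `extra` into a copy of `heap` and returns the key now stored at `pos`.
// Anything that remembered "my item is at pos" may now be looking at another item.
inline std::string key_at_after_push(std::vector<Item> heap, std::size_t pos, Item extra) {
    assert(std::is_heap(heap.begin(), heap.end(), more_used));
    heap.push_back(extra);                                // may also reallocate the vector
    std::push_heap(heap.begin(), heap.end(), more_used);  // moves items between slots
    return heap[pos].key;
}
```

For example, with the heap `{a:1, b:5, c:7}`, slot 0 holds `a`. After pushing `d` with a count of 0, slot 0 holds `d`, so an iterator that the map stored for `a` would silently refer to the wrong item.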
Something that seems a little odd: even if you were able to use std::priority_queue, you would still face the iterator invalidation problem.
To answer your questions directly: You are not missing something obvious. std::priority_queue is not as useful as it should be. The best approach is to write your own heap implementation that supports updates. Making it fully STL compatible (especially allocator aware) is rather tricky and not a simple task. On top of that, implement the LFU cache.
For the first step, look at the Boost implementations to get an idea of the effort. I'm not aware of any reference implementation for the second.
To work around iterator invalidation you can always choose indirection into another container, although you should try to avoid it, as it adds cost and can get quite messy.
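As a rough sketch of what an updatable heap with indirection could look like, the class below keeps a binary min-heap on the use count inside a std::vector and a std::map from key to the item's current index. Every swap inside the heap also updates the map, so it never holds stale positions. The names (`LfuCache`, `get`, `put`, `swap_slots`) and the choice of std::string keys and values are assumptions made for this example, not something taken from the question or from Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical LFU cache: min-heap on uses_count plus a key -> heap index map.
class LfuCache {
public:
    explicit LfuCache(std::size_t capacity) : capacity_(capacity) {}

    // Copies the value into `out` and bumps the use count; returns false on a miss.
    bool get(const std::string& key, std::string& out) {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        std::size_t pos = it->second;
        ++heap_[pos].uses_count;
        pos = sift_down(pos);  // a larger count can only move away from the root
        out = heap_[pos].value;
        return true;
    }

    // Inserts or overwrites a value; evicts the least-used item when full.
    void put(const std::string& key, const std::string& value) {
        if (capacity_ == 0) return;
        auto it = index_.find(key);
        if (it != index_.end()) { heap_[it->second].value = value; return; }
        if (heap_.size() == capacity_) evict();
        heap_.push_back(Item{key, value, 1});
        index_[key] = heap_.size() - 1;
        sift_up(heap_.size() - 1);
    }

    bool contains(const std::string& key) const { return index_.count(key) != 0; }
    std::size_t size() const { return heap_.size(); }

private:
    struct Item { std::string key; std::string value; unsigned long uses_count; };

    // Swaps two heap slots and keeps the key -> index map in sync.
    void swap_slots(std::size_t a, std::size_t b) {
        if (a == b) return;
        std::swap(heap_[a], heap_[b]);
        index_[heap_[a].key] = a;
        index_[heap_[b].key] = b;
    }

    void sift_up(std::size_t pos) {
        while (pos > 0) {
            std::size_t parent = (pos - 1) / 2;
            if (heap_[parent].uses_count <= heap_[pos].uses_count) break;
            swap_slots(parent, pos);
            pos = parent;
        }
    }

    // Returns the final position of the item that started at `pos`.
    std::size_t sift_down(std::size_t pos) {
        for (;;) {
            std::size_t smallest = pos, l = 2 * pos + 1, r = 2 * pos + 2;
            if (l < heap_.size() && heap_[l].uses_count < heap_[smallest].uses_count) smallest = l;
            if (r < heap_.size() && heap_[r].uses_count < heap_[smallest].uses_count) smallest = r;
            if (smallest == pos) return pos;
            swap_slots(pos, smallest);
            pos = smallest;
        }
    }

    // Removes the root, which is the least frequently used item.
    void evict() {
        swap_slots(0, heap_.size() - 1);
        index_.erase(heap_.back().key);
        heap_.pop_back();
        if (!heap_.empty()) sift_down(0);
    }

    std::size_t capacity_;
    std::vector<Item> heap_;
    std::map<std::string, std::size_t> index_;
};
```

With this layout a lookup, a count update and an eviction are each O(log n). The price is the extra map write on every swap, which is the additional cost and messiness mentioned above. The sketch is not allocator aware and breaks ties between equal counts arbitrarily.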
A somewhat simpler approach than keeping two data structures:
- collect the items into a vector (O(n))
- use std::nth_element to find the worst 10% (O(n))
- remove them (O(n log n))

So, adding a new element to the cache is common case O(log n), worst case O(n log n), and amortized O(log n).
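One possible reading of the steps above, sketched under assumptions: the cache is a single std::map from key to a value and use count, and the cleanup runs only when the cache is full. `Entry`, `Cache` and `evict_least_used` are hypothetical names, and the vector holds map iterators rather than copies of the items.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical entry type: the cached value plus its use count.
struct Entry {
    std::string value;
    unsigned long uses_count;
};
using Cache = std::map<std::string, Entry>;

// Removes roughly the least-used `fraction` of the entries (at least one
// if the cache is non-empty) and returns how many were removed.
inline std::size_t evict_least_used(Cache& cache, double fraction = 0.10) {
    if (cache.empty()) return 0;

    // Step 1: collect iterators to all entries, O(n).
    std::vector<Cache::iterator> its;
    its.reserve(cache.size());
    for (auto it = cache.begin(); it != cache.end(); ++it) its.push_back(it);

    std::size_t k = static_cast<std::size_t>(cache.size() * fraction);
    k = std::min(std::max<std::size_t>(k, 1), cache.size());

    // Step 2: partition so the k least-used entries come first, O(n) on average.
    std::nth_element(its.begin(), its.begin() + (k - 1), its.end(),
                     [](Cache::iterator a, Cache::iterator b) {
                         return a->second.uses_count < b->second.uses_count;
                     });

    // Step 3: erase them. Erasing a map element does not invalidate iterators
    // to the other elements, so the remaining entries in `its` stay usable.
    for (std::size_t i = 0; i < k; ++i) cache.erase(its[i]);
    return k;
}
```

Because this sketch erases through iterators, which is amortized constant per element for std::map, the removal step here is cheaper than the O(n log n) bound quoted above. Erasing by key would match that bound.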
Removing the worst 10% might be a bit drastic in an LFU cache, because new entries have to make the top 90% or they're cut. Then again, if you only remove one element, then new entries still need to get off the bottom before the next new entry arrives, or they're cut, and they have less time to do so. So depending on why LFU is the right caching strategy for you, my change to it might be the wrong strategy, or it might still be fine.