I have implemented the Design LRU Cache problem on LeetCode using the conventional method (doubly linked list + hash map). For those unfamiliar with the problem, the implementation looks something like this:
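# A typical doubly-linked-list + hash-map sketch of the conventional
# approach (reconstructed for illustration; not the asker's exact code):
class Node:
    def __init__(self, key=0, value=0):
        self.key, self.value = key, value
        self.prev = self.next = None

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.map = {}                          # key -> Node
        self.head, self.tail = Node(), Node()  # sentinel nodes
        self.head.next = self.tail
        self.tail.prev = self.head

    def _unlink(self, node):
        # O(1) removal from anywhere in the list.
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_back(self, node):
        # Insert just before the tail sentinel (the most-recently-used end).
        node.prev = self.tail.prev
        node.next = self.tail
        self.tail.prev.next = node
        self.tail.prev = node

    def get(self, key: int) -> int:
        if key not in self.map:
            return -1
        node = self.map[key]
        self._unlink(node)       # move to the back: now most recently used
        self._push_back(node)
        return node.value

    def put(self, key: int, value: int) -> None:
        if key in self.map:
            self._unlink(self.map[key])
        node = Node(key, value)
        self.map[key] = node
        self._push_back(node)
        if len(self.map) > self.capacity:
            lru = self.head.next             # front = least recently used
            self._unlink(lru)
            del self.map[lru.key]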
I understand why this method is used (quick removal/insertion at both ends, fast access in the middle). What I am failing to understand is why someone would use both a hash map and a linked list when one could simply use an array-based deque (ArrayDeque in Java, std::deque in C++). This deque allows for easy insertion/deletion at both ends and quick access in the middle, which is exactly what you need for an LRU cache. You would also use less space, because you wouldn't need to store a pointer for each node.
Is there a reason why the LRU cache is almost universally designed (in most tutorials, at least) using the HashMap + linked list method as opposed to the deque/ArrayDeque method? Would the HashMap/LinkedList method have any benefits?
When an LRU cache is full, we discard the Least Recently Used item.
If we're discarding items from the front of the queue, then we have to make sure the item at the front is the one that hasn't been used for the longest time.
We ensure this by making sure that an item goes to the back of the queue whenever it is used. The item at the front is then the one that hasn't been moved to the back for the longest time.
To do this, we need to maintain the queue on every put or get operation:
When we put a new item in the cache, it becomes the most recently used item, so we put it at the back of the queue.
When we get an item that is already in the cache, it becomes the most recently used item, so we move it from its current position to the back of the queue.
Moving items from the middle to the end is not a deque operation and is not supported by the ArrayDeque interface. It's also not supported efficiently by the underlying data structure that ArrayDeque uses. Doubly linked lists are used because they do support this operation efficiently.
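To make that concrete, here is a minimal sketch of the constant-time pointer splice a doubly linked list allows, assuming a hypothetical Node type with prev/next pointers and a direct reference to the node (typically obtained from the hash map):

def move_to_back(node, tail):
    # Unlink the node from wherever it currently sits: O(1), no shifting.
    node.prev.next = node.next
    node.next.prev = node.prev
    # Splice it back in just before the tail sentinel.
    node.prev = tail.prev
    node.next = tail
    tail.prev.next = node
    tail.prev = node

An array-backed deque would instead have to locate the element and shift everything behind it over, which is O(n).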
The purpose of an LRU cache is to support two operations in O(1) time, get(key) and put(key, value), with the additional constraint that the least recently used keys are discarded first. Normally the keys are the parameters of a function call and the value is the cached output of that call.
Regardless of how you approach this problem, we can agree that you MUST use a hashmap: you need it to map a key already present in the cache to its value in O(1).
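That function-call pattern is exactly what Python's built-in functools.lru_cache decorator provides, for example:

from functools import lru_cache

@lru_cache(maxsize=128)  # keep at most 128 distinct argument tuples
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

fib(30)  # each distinct n is computed once, then served from the cache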
To deal with the additional constraint of least recently used keys being discarded first, you can use a LinkedList or an ArrayDeque. However, since we don't actually need to access the middle, a LinkedList is better, since you don't need to resize.
Edit:
Mr. Timmermans discussed in his answer why ArrayDeques cannot be used in an LRU cache due to the necessity of moving elements from the middle to the end. With that said, here is an implementation of an LRU cache that successfully submits on LeetCode using only appends and poplefts on the deque. Note that Python's collections.deque is implemented as a doubly linked list of fixed-size blocks; however, we only use operations on collections.deque that would also be O(1) on a circular array, so the algorithm stays the same regardless.
from collections import deque

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.hashmap = {}     # key -> [value, number of copies of key in deque]
        self.deque = deque()  # access history; may hold duplicate keys

    def get(self, key: int) -> int:
        res = self.hashmap.get(key, [-1, 0])[0]
        if res != -1:
            # Refresh the key by re-putting it: this appends a new copy to
            # the back of the deque instead of moving the old one.
            self.put(key, res)
        return res

    def put(self, key: int, value: int) -> None:
        self.add(key, value)
        # Evict stale entries until we're back within capacity.
        while len(self.hashmap) > self.capacity:
            self.remove()

    def add(self, key, value):
        if key in self.hashmap:
            self.hashmap[key][1] += 1  # one more copy now lives in the deque
            self.hashmap[key][0] = value
        else:
            self.hashmap[key] = [value, 1]
        self.deque.append(key)

    def remove(self):
        # Lazy deletion: pop keys off the front, but only delete a key from
        # the hashmap once its last (most recent) copy has been popped.
        k = self.deque.popleft()
        self.hashmap[k][1] -= 1
        if self.hashmap[k][1] == 0:
            del self.hashmap[k]
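As a quick sanity check, here is the access sequence from the LeetCode problem statement run against the class above:

cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))  # 1
cache.put(3, 3)      # evicts key 2 (its only deque copy reaches the front)
print(cache.get(2))  # -1
cache.put(4, 4)      # evicts key 1
print(cache.get(1))  # -1
print(cache.get(3))  # 3
print(cache.get(4))  # 4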
I do agree with Mr. Timmermans that the LinkedList approach is preferable, but I want to highlight that using an ArrayDeque to build an LRU cache is possible.
The main mix-up between myself and Mr. Timmermans is how we interpreted capacity: I took capacity to mean caching the last N get/put requests, while Mr. Timmermans took it to mean caching the last N unique items.
The above code does have a loop in put which slows it down, but this is just to make the code conform to caching the last N unique items. If we instead had the code cache the last N requests, we could replace the loop with:

if len(self.deque) > self.capacity:
    self.remove()

This will make it as fast, if not faster, than the linked-list variant.
Regardless of how capacity is interpreted, the above method still works as an LRU cache: least recently used elements get discarded first.
I just want to highlight that designing an LRU cache in this manner is possible. The source is right there - try submitting it on LeetCode!