I am trying to implement my own LRU cache. Yes, I know that Java provides a LinkedHashMap for this purpose, but I am trying to implement it using basic data structures.
From reading about this topic, I understand that I need a HashMap for O(1) lookup of a key and a linked list for management of the "least recently used" eviction policy. I found these references that all use a standard library hashmap but implement their own linked list:
The hash table is supposed to directly store a linked list Node as I show below. My cache should store Integer keys and String values.
However, in Java the LinkedList collection does not expose its internal nodes, so I can't store them inside the HashMap. I could instead have the HashMap store indices into the LinkedList, but then getting to an item would require O(N) time. So I tried to store a ListIterator instead.
import java.util.Map;
import java.util.HashMap;
import java.util.List;
import java.util.LinkedList;
import java.util.ListIterator;
public class LRUCache {
private static final int DEFAULT_MAX_CAPACITY = 10;
protected Map<Integer, ListIterator> _map = new HashMap<Integer, ListIterator>();
protected LinkedList<String> _list = new LinkedList<String>();
protected int _size = 0;
protected int _maxCapacity = 0;
public LRUCache(int maxCapacity) {
_maxCapacity = maxCapacity;
}
// Put the key, value pair into the LRU cache.
// The value is placed at the head of the linked list.
public void put(int key, String value) {
// Check to see if the key is already in the cache.
ListIterator iter = _map.get(key);
if (iter != null) {
// Key already exists, so remove it from the list.
iter.remove(); // Problem 1: ConcurrentModificationException!
}
// Add the new value to the front of the list.
_list.addFirst(value);
_map.put(key, _list.listIterator(0));
_size++;
// Check if we have exceeded the capacity.
if (_size > _maxCapacity) {
// Remove the least recently used item from the tail of the list.
_list.removeLast();
}
}
// Get the value associated with the key.
// Move value to the head of the linked list.
public String get(int key) {
String result = null;
ListIterator iter = _map.get(key);
if (iter != null) {
//result = iter
// Problem 2: HOW DO I GET THE STRING FROM THE ITERATOR?
}
return result;
}
public static void main(String argv[]) throws Exception {
LRUCache lruCache = new LRUCache(10);
lruCache.put(10, "This");
lruCache.put(20, "is");
lruCache.put(30, "a");
lruCache.put(40, "test");
lruCache.put(30, "some"); // Causes ConcurrentModificationException
}
}
So this leads to three problems:
Problem 1: I am getting a ConcurrentModificationException when I update the LinkedList using the iterator that I store in the HashMap.
Exception in thread "main" java.util.ConcurrentModificationException
at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:953)
at java.util.LinkedList$ListItr.remove(LinkedList.java:919)
at LRUCache.put(LRUCache.java:31)
at LRUCache.main(LRUCache.java:71)
Problem 2. How do I retrieve the value pointed to by the ListIterator? It seems I can only retrieve the next() value.
Problem 3. Is there any way to implement this LRU cache using the Java collections LinkedList, or do I really have to implement my own linked list?
How do you fix Java's ConcurrentModificationException? There are two basic approaches: Do not make any changes to a collection while an Iterator loops through it. If you can't stop the underlying collection from being modified during iteration, create a clone of the target data structure and iterate through the clone.
Iterators in java are used to iterate over the Collection objects. Fail-Fast iterators immediately throw ConcurrentModificationException if there is structural modification of the collection. Structural modification means adding, removing any element from collection while a thread is iterating over that collection.
ConcurrentModificationException is a predefined Exception in Java, which occurs while we are using Java Collections, i.e whenever we try to modify an object concurrently without permission ConcurrentModificationException occurs which is present in java. util package.
1) This isn't really what Iterators are for.
By contract, if you modify the list without using the iterator -- as you do here
_list.addFirst(value);
then ALL OPEN ITERATORS on that list should throw ConcurrentModificationException. They were open to a version of the list that no longer exists.
2) A LinkedList is not, exactly, a linked list of nodes. It's a java.util.List, whose backing implementation is a doubly linked list of nodes. That List contract is why it does not expose references to the backing implementation -- so operations like "remove this node, as a node, and move it to the head" are no good. This encapsulation is for your own protection (same as the concurrent mod exception) -- it allows your code to rely on the List semantics of a LinkedList (iterability, for example) without worry that some joker two cubes away was hacking away at its innards and broke the contract.
3) What you really need here is NOT a LinkedList. What you need is a Stack that that allows you to move any arbitrary entry to the head and dump the tail. You are implying that you want a fast seek time to an arbitrary entry and also a fast remove and a fast add, AND you want to be able to find the tail at any moment in case you need to remove it.
Fast seek time == HashSomething
Fast add/remove of arbitrary elements == LinkedSomething
Fast addressing of the final element == SomekindaList
4) You're going to need to build your own linking structure...or use a LinkedHashMap.
PS LinkedHashSet is cheating, it's implemented using a LinkedHashMap.
I'll deal with problem 3 first:
As you point out in your question, LinkedList
(like all well designed generic collections) hides the details of the implementation such as the nodes containing the links. In your case you need your hash map to reference these links directly as the values of the map. To do otherwise (e.g. having indirection through a third class) would defeat the purpose of an LRU cache to allow very low overhead on value access. But this is not possible with standard Java Collections - they don't (and shouldn't) provide direct access to internal structures.
So the logical conclusion of this is that, yes, you need to implement your own way of storing the order in which items in the cache have been used. That doesn't have to be a double-linked list. Those have traditionally been used for LRU caches because the most common operation is to move a node to the top of the list when it is accessed. That is an incredibly cheap operation in a double-linked list requiring just four nodes to be relinked with no memory allocation or free.
Problem 1 & 2:
Essentially the root cause here is that this you are trying to use iterators as a cursor. They are designed to be created, stepped through to perform some operation and then disposed of. Even if you get over the problems you are having I expect there will be further problems behind them. You're putting a square peg in a round hole.
So my conclusion is that you need to implement your own way to hold values in a class that keeps track of order of access. However it can be incredibly simple: only three operations are required: create, get value and remove from tail. Both create and get value must move the node to the head of the list. No inserting or deleting from the middle of the list. No deleting the head. No searching. Honestly dead simple.
Hopefully this will get you started :-)
public class <K,V> LRU_Map implements Map<K,V> {
private class Node {
private final V value;
private Node previous = null;
private Node next = null;
public Node(V value) {
this.value = value;
touch();
if (tail == null)
tail = this;
}
public V getValue() {
touch();
return value;
}
private void touch() {
if (head != this) {
unlink();
moveToHead();
}
}
private void unlink() {
if (tail == this)
tail = prev;
if (prev != null)
prev.next = next;
if (next != null)
next.prev = prev;
}
private void moveToHead() {
prev = null;
next = head;
head = this;
}
public void remove() {
assert this == tail;
assert this != head;
assert next == null;
if (prev != null)
prev.next = null;
tail = prev;
}
}
private final Map<K,Node> map = new HashMap<>();
private Node head = null;
private Node tail = null;
public void put(K key, V value) {
if (map.size() >= MAX_SIZE) {
assert tail != null;
tail.remove();
}
map.put(key, new Node(value));
}
public V get(K key) {
if (map.containsKey(key))
return map.get(key).getValue();
else
return null;
}
// and so on for other Map methods
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With