
What is a data structure kind of like a hash table, but infrequently-used keys are deleted?

I am looking for a data structure that operates similarly to a hash table, but where the table has a size limit. When the number of items in the table reaches the size limit, a culling function should be called to get rid of the least-retrieved key/value pairs in the table.

Here's some pseudocode of what I'm working on:

import java.util.HashMap;
import java.util.Map;

class MyClass {
  // Unbounded memo table: grows without limit as new values of n are seen.
  private Map<Integer, Integer> cache = new HashMap<Integer, Integer>();
  public int myFunc(int n) {
    if(cache.containsKey(n))
      return cache.get(n);
    int next = . . . ; //some complicated math.  guaranteed next != n.
    int ret = 1 + myFunc(next);
    cache.put(n, ret);
    return ret;
  }
}

What happens is that there are some values of n for which myFunc() will be called lots of times, but many other values of n which will only be computed once. So the cache could fill up with millions of values that are never needed again. I'd like to have a way for the cache to automatically remove elements that are not frequently retrieved.

This feels like a problem that must be solved already, but I'm not sure what the data structure is that I would use to do it efficiently. Can anyone point me in the right direction?


Update: I knew this had to be an already-solved problem. It's called an LRU cache, and it is easy to make by extending the LinkedHashMap class. Here is the code that incorporates the solution:

import java.util.LinkedHashMap;
import java.util.Map;

class MyClass {
  private final static int SIZE_LIMIT = 1000;
  // accessOrder = true: iteration order runs from least to most recently accessed.
  private Map<Integer, Integer> cache =
    new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
      // Invoked after each put; returning true evicts the least recently used entry.
      @Override
      protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest)
      {
        return size() > SIZE_LIMIT;
      }
  };
  public int myFunc(int n) {
    if(cache.containsKey(n))
      return cache.get(n);
    int next = . . . ; //some complicated math.  guaranteed next != n.
    int ret = 1 + myFunc(next);
    cache.put(n, ret);
    return ret;
  }
}
Kip asked Nov 07 '08 16:11


3 Answers

You are looking for an LRUList/Map. Check out LinkedHashMap:

The removeEldestEntry(Map.Entry) method may be overridden to impose a policy for removing stale mappings automatically when new mappings are added to the map.
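To make the quoted behavior concrete, here is a minimal sketch of a size-limited map built on removeEldestEntry. The class names BoundedMap and BoundedMapDemo and the maxEntries parameter are made up for illustration and are not part of the JDK. This version uses the default insertion order, so the "eldest" entry is simply the one that was inserted first.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class: the name and the maxEntries parameter are illustrative.
class BoundedMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedMap(int maxEntries) {
        // accessOrder = false: entries stay in insertion order.
        super(16, 0.75f, false);
        this.maxEntries = maxEntries;
    }

    // LinkedHashMap calls this after each insertion; returning true
    // removes the eldest entry (the head of the internal linked list).
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

class BoundedMapDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new BoundedMap<>(2);
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3); // exceeds the limit of 2, so "a" is evicted
        System.out.println(m.keySet()); // prints [b, c]
    }
}
```

Note that eviction only happens on insertion, so the map never holds more than maxEntries entries after a put returns.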

ReneS answered Nov 15 '22 12:11


Googling "LRU map" and "I'm feeling lucky" gives you this:

http://commons.apache.org/proper/commons-collections//javadocs/api-release/org/apache/commons/collections4/map/LRUMap.html

A Map implementation with a fixed maximum size which removes the least recently used entry if an entry is added when full.

Sounds pretty much spot on :)

The Archetypal Paul answered Nov 15 '22 13:11


WeakHashMap will probably not do what you expect it to. Read the documentation carefully and make sure you know exactly what you get from weak and strong references.

I would recommend you have a look at java.util.LinkedHashMap and use its removeEldestEntry method to maintain your cache. If your math is very resource intensive, you might want to move entries to the front whenever they are used to ensure that only unused entries fall to the end of the set.
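LinkedHashMap can do the reordering on use by itself: passing true as the third constructor argument (accessOrder) makes every get or put move that entry to the most-recently-used end of the iteration order, which leaves the least recently used entry as the "eldest" one that removeEldestEntry sees. The sketch below compares the two orderings. The class name AccessOrderDemo and the method orderAfterRead are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class: names are illustrative, not from the original post.
class AccessOrderDemo {
    // Inserts keys 1, 2, 3, reads key 1 once, and returns the iteration order.
    static List<Integer> orderAfterRead(boolean accessOrder) {
        Map<Integer, String> m = new LinkedHashMap<>(16, 0.75f, accessOrder);
        m.put(1, "one");
        m.put(2, "two");
        m.put(3, "three");
        m.get(1); // counts as an access only when accessOrder is true
        return new ArrayList<>(m.keySet());
    }

    public static void main(String[] args) {
        // Insertion order: the read does not change anything.
        System.out.println(orderAfterRead(false)); // prints [1, 2, 3]
        // Access order: key 1 moved to the end, so key 2 is now the eldest.
        System.out.println(orderAfterRead(true));  // prints [2, 3, 1]
    }
}
```

One detail worth knowing: containsKey does not count as an access, so a cache that only checks containsKey without calling get will not refresh the entry's position.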

kasperjj answered Nov 15 '22 14:11