Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Looking for a drop-in replacement for a java.util.Map

Problem

Following up on this question, it seems that a file- or disk-based Map implementation may be the right solution to the problems I mentioned there. Short version:

  • Right now, I have a Map implemented as a ConcurrentHashMap.
  • Entries are added to it continually, at a fairly fixed rate. Details on this later.
  • Eventually, no matter what, this means the JVM runs out of heap space.

At work, it was (strongly) suggested that I solve this problem using SQLite, but after asking that previous question, I don't think that a database is the right tool for this job. So - let me know if this sounds crazy - I think a better solution would be a Map stored on disk.

Bad idea: implement this myself. Better idea: use someone else's library! Which one?

Requirements

Must-haves:

  • Free.
  • Persistent. The data needs to stick around between JVM restarts.
  • Some sort of searchability. Yes, I need the ability to retrieve this darn data as well as put it away. Basic result set filtering is a plus.
  • Platform-independent. Needs to be production-deployable on Windows or Linux machines.
  • Purgeable. Disk space is finite, just like heap space. I need to get rid of entries that are n days old. It's not a big deal if I have to do this manually.

Nice-to-haves:

  • Easy to use. It would be great if I could get this working by the end of the week.
    Better still: the end of the day. It would be really, really great if I could add one JAR to my classpath, change new ConcurrentHashMap<Foo, Bar>(); to new SomeDiskStoredMap<Foo, Bar>();
    and be done.
  • Decent scalability and performance. Worst case: new entries are added (on average) 3 times per second, every second, all day long, every day. However, inserts won't always happen that smoothly. It might be (no inserts for an hour) then (insert 10,000 objects at once).

Possible Solutions

  • Ehcache? I've never used it before. It was a suggested solution to my previous question.
  • Berkeley DB? Again, I've never used it, and I really don't know anything about it.
  • Hadoop (and which subproject)? Haven't used it. Based on these docs, its cross-platform-readiness is ambiguous to me. I don't need distributed operation in the foreseeable future.
  • A SQLite JDBC driver after all?
  • ???

Ehcache and Berkeley DB both look reasonable right now. Any particular recommendations in either direction?

like image 646
Matt Ball Avatar asked Jan 18 '11 16:01

Matt Ball


1 Answers

UPDATE (some 4 years after first post...): beware that in newer versions of ehcache, persistence of cache items is available only in the pay product. Thanks @boday for pointing this out.

ehcache is great. It will give you the flexibility you need to implement the map in memory, disk or memory with spillover to disk. If you use this very simple wrapper for java.util.Map then using it is blindingly simple:

import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;

import net.sf.ehcache.Cache;
import net.sf.ehcache.Element;

import org.apache.log4j.Logger;

import com.google.common.collect.Sets;

public class EhCacheMapAdapter<K,V> implements Map<K,V> {
    @SuppressWarnings("unused")
    private final static Logger logger = Logger
            .getLogger(EhCacheMapAdapter.class);

    public Cache ehCache;

    public EhCacheMapAdapter(Cache ehCache) {
        super();
        this.ehCache = ehCache;
    } // end constructor

    @Override
    public void clear() {
        ehCache.removeAll();
    } // end method

    @Override
    public boolean containsKey(Object key) {
        return ehCache.isKeyInCache(key);
    } // end method

    @Override
    public boolean containsValue(Object value) {
        return ehCache.isValueInCache(value);
    } // end method

    @Override
    public Set<Entry<K, V>> entrySet() {
        throw new UnsupportedOperationException();
    } // end method

    @SuppressWarnings("unchecked")
    @Override
    public V get(Object key) {
        if( key == null ) return null;
        Element element = ehCache.get(key);
        if( element == null ) return null;
        return (V)element.getObjectValue();
    } // end method

    @Override
    public boolean isEmpty() {
        return ehCache.getSize() == 0;
    } // end method

    @SuppressWarnings("unchecked")
    @Override
    public Set<K> keySet() {
        List<K> l = ehCache.getKeys();
        return Sets.newHashSet(l);
    } // end method

    @SuppressWarnings("unchecked")
    @Override
    public V put(K key, V value) {
        Object o = this.get(key);
        if( o != null ) return (V)o;
        Element e = new Element(key,value);
        ehCache.put(e);
        return null;
    } // end method


    @Override
    public V remove(Object key) {
        V retObj = null;
        if( this.containsKey(key) ) {
            retObj = this.get(key);
        } // end if
        ehCache.remove(key);
        return retObj;
    } // end method

    @Override
    public int size() {
        return ehCache.getSize();
    } // end method

    @Override
    public Collection<V> values() {
        throw new UnsupportedOperationException();
    } // end method

    @Override
    public void putAll(Map<? extends K, ? extends V> m) {
        for( K key : m.keySet() ) {
            this.put(key, m.get(key));
        } // end for
    } // end method
} // end class
like image 109
harschware Avatar answered Sep 20 '22 15:09

harschware