Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Recommended low memory hashmap for implementation for Java

I am currently working on a programming related problem where I am attempted to make a massive hashmap of data. The key for the data is a custom low-memory implementation of a CharSequence that implements hashCode() and equals(...) and the value is am Integer object.

There may be millions of entries in this hashtable and I managed to drastically reduce memory use for the value by having the Integer be a pointer in a file to the data I wish to hash but thbe problem is that the key may be tens of bytes (on average 25 bytes) and that the keys need to be held in memory in the default implementation of HashMap.

I need a hashmap that has a low memory overhead and that can possibly page the keys to disk or alternatively store a hashed representation of the keys. If the keys are themselves hashed then I would be concerned about hash collisions.

Ideally, I would like to be able to store a million entries in the map per 50MB of heap space (one byte array of 25 bytes in the key and Integer object in the value part).

Does anyone have any experience with low-memory filesystem-backed Maps that are optimised for reducing the footprint of the keys?

Thanks,

Chris

like image 734
Chris Avatar asked Mar 05 '10 06:03

Chris


2 Answers

You could use Java's hash map and write a FileKey class that takes a RandomAccessFile, offset and length, precomputes the hash at construction and which implements Comparable by reading the data from the file just for the compare.

In conjunction with a simple MRU cache, you could keep some number of keys in memory using another hashmap which is keyed on the same keys, but which uses a custom comparator which compares just the offset and length values (not the file data).

like image 82
Lawrence Dol Avatar answered Nov 08 '22 12:11

Lawrence Dol


How about Berkeley DB Java Edition? Its StoredMap class looks like what you are looking for.

like image 38
Kai Chan Avatar answered Nov 08 '22 10:11

Kai Chan