 

Java: recommended way to persist HashMaps for permanent, reliable storage?

I am working on a project where a lot of searches are run against a very large data set, and I am finding that a traditional database structure isn't working: to answer queries in the time the application requires, I have to read tables into HashMaps that stay in memory the entire time.

I am wondering what the recommended way of persisting a HashMap is, both in terms of how quickly it can be loaded back from its persisted state and in terms of minimizing extra code. Right now I am writing custom classes that read the necessary data from DB tables and then build a nested HashMap reflecting the structure I need for the fastest possible searches. I am not sure whether simply writing to a text file is a proper way to do this, given that the data must be preserved and not corrupted. Any advice is appreciated.

Rick asked Mar 17 '11 11:03


3 Answers

Have you considered using key-value databases (like Redis or Riak)?

wesoly answered Nov 02 '22 21:11


  1. Ehcache.
  2. disk-backed-map

The following post might also help you:

recommend-a-fast-scalable-persistent-map-java
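
If an external store turns out to be more than you need, a plain-JDK baseline is to snapshot the map with Java serialization and replace the file by renaming a temp file over it, which addresses the corruption worry about hand-written text files. This is only a sketch under assumptions: `MapSnapshot`, `save` and `load` are made-up names, the keys and values are assumed to be `Serializable` (plain `String`s here), and the whole map is rewritten on every save, so it suits periodic snapshots rather than per-update persistence.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;

// Hypothetical helper (not part of any library): snapshots a nested HashMap to disk.
final class MapSnapshot {

    // Serialize into a temp file in the target's directory, then rename it over the
    // target, so an interrupted write does not truncate the previous snapshot.
    // Note: nothing here forces the data to the physical disk (no fsync), and how
    // ATOMIC_MOVE treats an existing target is implementation specific.
    static void save(HashMap<String, HashMap<String, String>> map, Path target)
            throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "snapshot", ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(tmp)))) {
            out.writeObject(map);
        }
        Files.move(tmp, target,
                StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
    }

    // Read the snapshot back into memory in one pass.
    @SuppressWarnings("unchecked")
    static HashMap<String, HashMap<String, String>> load(Path source)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(source)))) {
            return (HashMap<String, HashMap<String, String>>) in.readObject();
        }
    }
}
```

Calling `MapSnapshot.save(map, path)` after a rebuild from the DB and `MapSnapshot.load(path)` at startup avoids re-reading the tables on every start. Whether loading is fast enough for a very large map is something you would have to measure.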

Dead Programmer answered Nov 02 '22 20:11


If you are sticking content from your DB into a hash structure in order to speed up searches against that content, I think you're probably taking the wrong approach. I don't know exactly what you're trying to do, but perhaps an index like Lucene is appropriate? It is a mature and highly optimised index and will handle things like caching frequent queries in memory.

Alternatively, take a look at BerkeleyDB, which is basically a disk-backed hash DB and is also very fast. (Note, though, that I believe Oracle may now require a license for some use cases.)

The only caveat with Lucene and BerkeleyDB is that they require some overhead to set up. So my last suggestion is Tokyo Cabinet, which is a pretty decent, very quick and very simple to use disk-backed hash. Basically, just include the jar in your classpath and use it like a HashMap:

import java.io.IOException;

import tokyocabinet.HDB;

....

String dir = "/path/to/my/dir/";
HDB hash = new HDB();

// open the hash for read/write, create if does not exist on disk
if (!hash.open(dir + "unigrams.tch", HDB.OWRITER | HDB.OCREAT)) {
    throw new IOException("Unable to open " + dir + "unigrams.tch: " + hash.errmsg());
}

// Add something to the hash
hash.put("blah", "my string");

// Close it
hash.close();

And that's it. Anything you put in the hash is persisted to disk and can be reloaded later. And don't worry about speed: in-memory optimisations are handled for you behind the scenes.

Edit: It looks like Tokyo Cabinet has been superseded by Kyoto Cabinet.

Edit 2: You don't say what DB you're using, but if it's MySQL, does full-text search not work for you?

Richard H answered Nov 02 '22 21:11