I need to store lots of data (Objects) in memory (for computations).
Since the computations are based on this data, it is critical that all of it resides in the memory of a single JVM process.
Most data will be built from Strings, Integers and other sub-objects (Collections, HashSet, etc...).
Since Java's per-object memory overhead is significant (Strings are stored as UTF-16, and each object carries at least 8 bytes of header), I'm looking for libraries that store such data in memory with lower overhead.
I've read interesting articles about reducing memory:
* http://www.cs.virginia.edu/kim/publicity/pldi09tutorials/memory-efficient-java-tutorial.pdf
* http://blog.griddynamics.com/2010/01/java-tricks-reducing-memory-consumption.html
I was wondering whether a library for such scenarios already exists or whether I'll need to start from scratch.
To better understand my requirement, imagine a server that processes a high volume of records and needs to analyze them against millions of other records held in memory (for a high processing rate).
For collection overhead, have a look at Trove: its memory overhead is lower than that of the built-in Collections classes (especially for maps and sets; in the JDK, sets are themselves backed by maps).
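To illustrate where the saving comes from, here is a minimal sketch of a primitive-backed set. It is not Trove's API; the class name `IntOpenHashSet` and the `FREE` sentinel are made up for this example. A `HashSet<Integer>` keeps a boxed `Integer` plus a map entry object per element, whereas this sketch keeps only one `int` slot per table position.

```java
import java.util.Arrays;

/** Hypothetical set of int values stored in one int[] (open addressing, linear probing). */
final class IntOpenHashSet {
    // Sentinel marking an empty slot; assumption: this value is never stored.
    private static final int FREE = Integer.MIN_VALUE;

    private int[] table;
    private int size;

    IntOpenHashSet(int expectedSize) {
        int capacity = 16;
        while (capacity < expectedSize * 2) capacity <<= 1; // power of two, load <= 0.5
        table = new int[capacity];
        Arrays.fill(table, FREE);
    }

    /** Returns true if the value was not present before. */
    boolean add(int value) {
        if (value == FREE) throw new IllegalArgumentException("sentinel value not supported");
        if ((size + 1) * 2 > table.length) grow();
        if (!insert(table, value)) return false;
        size++;
        return true;
    }

    boolean contains(int value) {
        int mask = table.length - 1;
        // Probe until an empty slot is hit; the table is never more than half full.
        for (int i = mix(value) & mask; table[i] != FREE; i = (i + 1) & mask) {
            if (table[i] == value) return true;
        }
        return false;
    }

    int size() {
        return size;
    }

    private static boolean insert(int[] slots, int value) {
        int mask = slots.length - 1;
        int i = mix(value) & mask;
        while (slots[i] != FREE) {
            if (slots[i] == value) return false; // already present
            i = (i + 1) & mask;
        }
        slots[i] = value;
        return true;
    }

    /** Doubles the table and re-inserts every stored value. */
    private void grow() {
        int[] bigger = new int[table.length * 2];
        Arrays.fill(bigger, FREE);
        for (int v : table) {
            if (v != FREE) insert(bigger, v);
        }
        table = bigger;
    }

    /** Cheap bit mixer so that sequential keys spread over the table. */
    private static int mix(int h) {
        h *= 0x9E3779B9;
        return h ^ (h >>> 16);
    }
}
```

A real library handles removal, arbitrary values and iteration as well; the point here is only that no per-element object is allocated.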
If you have large objects, it might be worthwhile to store them "serialized" in some compact binary representation (not Java serialization) and deserialize them back into full-blown objects when needed.
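A rough sketch of what such a compact representation could look like, using only `java.nio.ByteBuffer`. The `CustomerRecord` class and its fields are hypothetical, chosen just to have two ints and a String to pack; the layout (two 4-byte ints followed by the name as UTF-8) is likewise an assumption, not a standard format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical record that can be flattened into a single byte[]. */
final class CustomerRecord {
    final int id;
    final int visits;
    final String name;

    CustomerRecord(int id, int visits, String name) {
        this.id = id;
        this.visits = visits;
        this.name = name;
    }

    /** Layout: 4 bytes id, 4 bytes visits, then the name as UTF-8 up to the end. */
    byte[] toBytes() {
        byte[] nameUtf8 = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(8 + nameUtf8.length);
        buf.putInt(id).putInt(visits).put(nameUtf8);
        return buf.array();
    }

    /** Rebuilds the full object only when it is actually needed. */
    static CustomerRecord fromBytes(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int id = buf.getInt();
        int visits = buf.getInt();
        // The name needs no length prefix because it occupies the rest of the array.
        String name = new String(bytes, 8, bytes.length - 8, StandardCharsets.UTF_8);
        return new CustomerRecord(id, visits, name);
    }
}
```

With this layout a record named "Alice" takes 13 bytes of payload in one array, instead of a record object, a String object and its backing character array. The trade-off is the CPU cost of decoding on every access.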
You could also use a cache library that can page out to disk; take a look at Infinispan or Ehcache. Also, some of those libraries (Ehcache among them, if memory serves) provide "off-heap storage" inside your JVM process: a chunk of memory, managed by the (native) library, that is not subject to GC. If you have an efficient binary representation you could store it there (this won't lower your footprint but might make GC behave better).
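To give an idea of the off-heap approach without depending on any particular cache library, here is a small sketch built on the JDK's own `ByteBuffer.allocateDirect`. It does not show Ehcache's or Infinispan's API; the `OffHeapLongTable` class and its fixed 12-byte slot layout are assumptions made for this example.

```java
import java.nio.ByteBuffer;

/** Hypothetical fixed-width table of (int key, long value) slots kept outside the Java heap. */
final class OffHeapLongTable {
    private static final int SLOT_BYTES = 12; // 4 bytes key + 8 bytes value

    private final ByteBuffer buffer;
    private final int capacity;

    OffHeapLongTable(int capacity) {
        this.capacity = capacity;
        // Direct buffers are allocated outside the garbage-collected heap.
        this.buffer = ByteBuffer.allocateDirect(capacity * SLOT_BYTES);
    }

    void put(int slot, int key, long value) {
        int offset = offsetOf(slot);
        buffer.putInt(offset, key);
        buffer.putLong(offset + 4, value);
    }

    int keyAt(int slot) {
        return buffer.getInt(offsetOf(slot));
    }

    long valueAt(int slot) {
        return buffer.getLong(offsetOf(slot) + 4);
    }

    boolean isDirect() {
        return buffer.isDirect();
    }

    private int offsetOf(int slot) {
        if (slot < 0 || slot >= capacity) {
            throw new IndexOutOfBoundsException("slot " + slot);
        }
        return slot * SLOT_BYTES;
    }
}
```

The garbage collector sees one small `ByteBuffer` object rather than one object per entry, which is why this can help GC behaviour even though the bytes still occupy process memory. The amount of direct memory available is bounded by the JVM's `-XX:MaxDirectMemorySize` setting.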