I did some profiling on my application and one of the results turned out that about 18% of memory on the heap is used by objects of type Double
. It turns out these objects are the values in Map
s, where I cannot use the primitive type.
My reasoning is that the primitive type of double
consumes less memory than it's object Double
. Is there a way to have a map like data structure, that would accept any type as key and a primitive double
as values?
Main operations would be:
Typical maps that I have are:
HashMap<T, HashMap<NodeData<T>, Double>> graph
HashMap<Point2D, Boolean> onSea
(though not a double value)ConcurrentHashMap<Point2D, HashMap<Point2D, Double>>
All used with Java 8.
Addendum
I am mainly not interested in frameworks that have a solution to these type of maps, but on what has to be considered when addressing these issues. If you want, what are the concepts/ideas/approaches behind any such framework. Or the solution may be also on another level, where the maps are replaced with objects following a certain pattern like @Ilmari Karonen pointed out in his answer.
ArrayList stores the elements only as values and maintains internally the indexing for every element. While HashMap stores elements with key and value pairs that means two objects. So HashMap takes more memory comparatively.
The keys and values of a map can be any reference type. We can't use primitive types because of a restriction around the way generics were designed. A HashMap allows one null key and multiple null values. It doesn't preserve the order of the elements and doesn't guarantee the order will remain the same over time.
HashMap is a general purpose Map implementation. It provides a performance of O(1) , while TreeMap provides a performance of O(log(n)) to add, search, and remove items. Hence, HashMap is usually faster.
As we know, the core Java Map s don't allow the storage of primitive keys or values. That's why we'll introduce some external third-party libraries that provide primitive map implementations. 2. Eclipse Collections
The IntIntMaps factory class is the most convenient way to create primitive maps. It allows us to create both mutable and immutable instances of the desired type of map.
It turns out these objects are the values in Map s, where I cannot use the primitive type. My reasoning is that the primitive type of double consumes less memory than it's object Double. Is there a way to have a map like data structure, that would accept any type as key and a primitive double as values?
But if that happens to be the case for your keys, or if you can arrange for it to be the case, arrays of primitive values can be very efficient. In particular, if you can assign a unique small integer ID to each key object, then you can use that ID as an index into an array.
Others have already suggested several third-party implementations of primitive-valued maps. For completeness, I'd like to mention some ways to get rid of the maps entirely that you might want to consider. These solutions won't always be possible, but when they are, they will usually be both faster and more memory-efficient than any map can be.
A simple double[]
array may not be as sexy as a fancy map, but very little can beat it in compactness and speed of access.
Of course, arrays come with a bunch of limitations: their size is fixed (although you can always create a new array and copy the old one's content into it) and their keys can only be small positive integers which, for efficiency, should be reasonably dense (i.e. the total number of used keys should be a reasonably large fraction of the highest key value). But if that happens to be the case for your keys, or if you can arrange for it to be the case, arrays of primitive values can be very efficient.
In particular, if you can assign a unique small integer ID to each key object, then you can use that ID as an index into an array. Similarly, if you're already storing your objects in an array (e.g. as part of some more complex data structure) and looking them up by index, then you can simply use the same index to look up any additional metadata values in another array.
You could even dispense with the ID uniqueness requirement, if you implemented some kind of a collision handling mechanism, but at that point you're well on your way towards implementing your own hash table. In some cases that might actually make sense, but usually at that point it's probably easier to use an existing third-party implementation.
Instead of maintaining a map from key objects into primitive values, why not just make those values into properties of the objects themselves? This is, after all, what object-oriented programming is all about — grouping related data into meaningful objects.
For example, instead of maintaining a HashMap<Point2D, Boolean> onSea
, why not just give your points a boolean onSea
property? Of course, you'll need to define your own custom point class for this, but there's no reason why you can't make it extend the standard Point2D
class if you want, so that you can pass your custom points into any method that expects a Point2D
.
Again, this approach may not always work directly, e.g. if you need to work with classes that you cannot modify (but see below), or if the values you want to store are associated with more than one object (as in your ConcurrentHashMap<Point2D, HashMap<Point2D, Double>>
).
However, for the latter case, you may still be able to solve the problem by suitably redesigning your data representation. For example, instead of representing a weighted graph as a Map<Node, Map<Node, Double>>
, you can define an Edge
class like:
class Edge { Node a, b; double weight; }
and then add an Edge[]
(or Vector<Edge>
) property to each node that contains any edges connected to that node.
If you have multiple maps with the same keys, and cannot just turn the values into new properties of the key objects as suggested above, consider grouping them into a single metadata class and creating a single map from the keys into objects of that class. For example, instead of a Map<Item, Double> accessFrequency
and a Map<Item, Long> creationTime
, consider defining a single metadata class like:
class ItemMetadata { double accessFrequency; long creationTime; }
and having a single Map<Item, ItemMetadata>
to store all the metadata values. This is more memory-efficient than having multiple maps, and may also save time by avoiding redundant map lookups.
In some cases, for convenience, you may also wish to include in each metadata object a reference to its corresponding main object, so that you can access both through a single reference to the metadata object. Which naturally segues into...
As a combination of the previous two alternatives, if you cannot directly add extra metadata properties into the key objects, consider instead wrapping them with decorators that can hold the extra values. Thus, for example, instead of directly creating your own point class with extra properties, you could simply do something like:
class PointWrapper { Point2D point; boolean onSea; // ... }
If you like, you can even turn this wrapper into a full-blown decorator by implementing method forwarding, but even just a simple "dumb" wrapper may be sufficient for many purposes.
This approach is most useful if you can then arrange to store and work with just the wrappers, so that you never need to look up the wrapper corresponding to an unwrapped object. Of course, if you do need to do that occasionally (e.g. because you're only receiving the unwrapped objects from other code), then you can do that with a single Map<Point2D, PointWrapper>
, but then you're effectively back at the previous alternative.
Eclipse Collections has object and primitive maps and has Mutable and Immutable versions for both.
MutableObjectDoubleMap<String> doubleMap = ObjectDoubleMaps.mutable.empty(); doubleMap.put("1", 1.0d); doubleMap.put("2", 2.0d); MutableObjectBooleanMap<String> booleanMap = ObjectBooleanMaps.mutable.empty(); booleanMap.put("ok", true); ImmutableObjectDoubleMap<String> immutableMap = doubleMap.toImmutable(); Assert.assertEquals(doubleMap, immutableMap);
A MutableMap
can be used as a factory for an ImmutableMap
in Eclipse Collections by calling toImmutable
as I have done in the example above. Both mutable and immutable maps share a common parent interface, which in the case of the MutableObjectDoubleMap
and ImmutableObjectDoubleMap
above, is named ObjectDoubleMap
.
Eclipse Collections also has synchronized and unmodifiable versions for all mutable containers in the library. The following code will give you a synchronized view wrapped around the primitive maps.
MutableObjectDoubleMap<String> doubleMap = ObjectDoubleMaps.mutable.<String>empty().asSynchronized(); doubleMap.put("1", 1.0d); doubleMap.put("2", 2.0d); MutableObjectBooleanMap<String> booleanMap = ObjectBooleanMaps.mutable.<String>empty().asSynchronized(); booleanMap.put("ok", true);
This performance comparison of large maps was published a couple of years ago.
Large HashMap overview: JDK, FastUtil, Goldman Sachs, HPPC, Koloboke, Trove – January 2015 version
GS Collections has since been migrated to the Eclipse Foundation and is now Eclipse Collections.
Note: I am a committer for Eclipse Collections.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With