Is there an already implemented data structure that I can use in order to assign to an object (in my case an Edge), an integer? I am reading a graph from a file , 10 mil vertices , 60 mil edges and I assign to each edge , a cost , using a map ( costs.put(e,cost) ).
I create the costs map in this way :
costs = new HashMap<Edge,Integer>();
The exception that it gives is :
java.lang.OutOfMemoryError: Java heap space
at java.util.HashMap.resize(Unknown Source)
at java.util.HashMap.addEntry(Unknown Source)
at java.util.HashMap.put(Unknown Source)
HashMap
is the correct data structure for a basic Map
. The problem you are having is that the JVM is not being instructed to reserve enough space to keep the file contents in memory. Start the JVM with a -Xmx
flag. For instance -Xmx1G
parameter will allow it to use 1 Gigabyte of memory.
You've got 6e7 edges. A plain Object takes 24 bytes (64-bit HotSpot), so that's 1.44e9 bytes right there (1.5 GB). Now you introduce the most efficient map imaginable, adding only 6e7 references plus 6e7 Integer
objects. That's another 2.4e8 bytes for the refs and 1.44e9 bytes for the Integer
s: another 1.5 GB, the total being now 3 GB—and this is the theoretical lower bound for your problem (modulo caching, see below).
Based on this I suggest you just extend your Edge
class with another int
field. This will drastically reduce your memory footprint.
If this is not an option, however, and:
new Integer
, but Integer.valueOf
, autoboxing, etc.,you'll automatically benefit from the built-in small integer cache. If the integers assume values from a wider range, but still with lot of duplication, a custom cache is highly advisable.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With