I have a pretty large HashMap (~250 MB). Creating it takes about 50-55 seconds, so I decided to serialize it and save it to a file. Reading it back from the file takes about 16-17 seconds now.
The only problem is that lookups seem to be slower this way. I always thought that the HashMap is read from the file into memory, so the performance should be the same compared to the case when I create the HashMap myself, right? Here is the code I am using to read the HashMap from the file:
File file = new File("omaha.ser");
FileInputStream f = new FileInputStream(file);
ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
omahaMap = (HashMap<Long, Integer>) s.readObject();
s.close();
300 million lookups take about 3.1 seconds when I create the HashMap myself, and about 8.5 seconds when I read the same HashMap from the file. Does anybody have an idea why? Am I overlooking something obvious?
EDIT:
I "measured" the time by just taking the time with System.nanotime(), so no proper benchmark method used. Here is the code:
public class HandEvaluationTest
{
    public static void Test()
    {
        HandEvaluation.populate5Card();
        HandEvaluation.populate9CardOmaha();

        Card[] player1cards = {new Card("4s"), new Card("2s"), new Card("8h"), new Card("4d")};
        Card[] player2cards = {new Card("As"), new Card("9s"), new Card("6c"), new Card("2h")};
        Card[] player3cards = {new Card("9h"), new Card("7h"), new Card("Kc"), new Card("Kh")};
        Card[] table = {new Card("2d"), new Card("2c"), new Card("3c"), new Card("5c"), new Card("4h")};

        int j = 0, k = 0, l = 0;
        long startTime = System.nanoTime();
        for (int p = 0; p < 100000000; p++) {
            j = HandEvaluation.handEval9Hash(player1cards, table);
            k = HandEvaluation.handEval9Hash(player2cards, table);
            l = HandEvaluation.handEval9Hash(player3cards, table);
        }
        long estimatedTime = System.nanoTime() - startTime;
        System.out.println("Time needed: " + estimatedTime * Math.pow(10, -6) + "ms");
        System.out.println("Handstrength Player 1: " + j);
        System.out.println("Handstrength Player 2: " + k);
        System.out.println("Handstrength Player 3: " + l);
    }
}
The big HashMap work is done in HandEvaluation.populate9CardOmaha(); the 5-card one is small. The code for the big one:
public static void populate9CardOmaha()
{
    // Check if the HashMap is already there - then just read it and exit
    File hashmap = new File("omaha.ser");
    if (hashmap.exists())
    {
        try
        {
            File file = new File("omaha.ser");
            FileInputStream f = new FileInputStream(file);
            ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
            omahaMap = (HashMap<Long, Integer>) s.readObject();
            s.close();
        }
        catch (IOException ioex) { ioex.printStackTrace(); }
        catch (ClassNotFoundException cnfex)
        {
            System.out.println("Class not found");
            cnfex.printStackTrace();
            return;
        }
        return;
    }

    // If it's not there, populate it yourself
    ... Code for populating hashmap ...

    // and then save it to file
    try
    {
        File file = new File("omaha.ser");
        FileOutputStream f = new FileOutputStream(file);
        ObjectOutputStream s = new ObjectOutputStream(new BufferedOutputStream(f));
        s.writeObject(omahaMap);
        s.close();
    }
    catch (IOException ioex) { ioex.printStackTrace(); }
}
When I am populating it myself (= the file is not there), lookups in HandEvaluationTest.Test() take about 8 seconds instead of 3. Maybe it's just my very naive way of measuring the elapsed time?
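A fairer way to measure is to warm the JIT up first and only then time the lookups. Here is a minimal, self-contained sketch of that idea; the map contents and sizes are made up for illustration and are not from the code above:
import java.util.HashMap;
import java.util.Map;

public class LookupBenchmark
{
    public static void main(String[] args)
    {
        // Build the map up front so population cost is not part of the measurement
        Map<Long, Integer> map = new HashMap<Long, Integer>();
        for (long i = 0; i < 1000000; i++)
            map.put(i, (int) i);

        // Warm-up rounds: let the JIT compile the lookup loop before timing it
        long sink = 0;
        for (int warm = 0; warm < 5; warm++)
            sink += runLookups(map);

        // Measured round: only Map.get() calls are inside the timed region
        long start = System.nanoTime();
        sink += runLookups(map);
        long elapsed = System.nanoTime() - start;
        System.out.println("Lookup time: " + elapsed / 1000000.0 + " ms (sink=" + sink + ")");
    }

    private static long runLookups(Map<Long, Integer> map)
    {
        long sum = 0;
        for (long i = 0; i < 1000000; i++)
            sum += map.get(i);   // consume the result so the loop cannot be optimized away
        return sum;
    }
}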
HashMap's bottleneck: each bucket is essentially a simple linked list. Finding an element in a linked list is O(n), which is not a problem as long as the list stays small. Problems start when there are many hash-code collisions: instead of a large number of small buckets, we end up with a small number of big buckets.
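To make the collision problem concrete, here is a small hypothetical demo; BadKey is an invented class whose hashCode() sends every instance to the same bucket. (On Java 8 and later, HashMap turns heavily collided buckets into balanced trees, so the worst case there is O(log n) rather than O(n); the linked-list behavior described above is that of older JVMs.)
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo
{
    // Invented key type: every instance returns the same hash code,
    // so all entries land in a single bucket
    static final class BadKey
    {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o)
        {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
    }

    public static void main(String[] args)
    {
        Map<BadKey, Integer> map = new HashMap<BadKey, Integer>();
        for (int i = 0; i < 10000; i++)
            map.put(new BadKey(i), i);

        long start = System.nanoTime();
        map.get(new BadKey(9999));   // must search the whole collided bucket
        System.out.println((System.nanoTime() - start) / 1000 + " us for one lookup");
    }
}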
The practical difference between HashMap and HashSet is that a HashMap stores each value under a unique key and lets you retrieve the value for a given key, while a HashSet only stores the objects themselves and can answer membership queries (it is in fact implemented on top of a HashMap internally).
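A tiny illustration of that API difference (the values here are made up):
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class MapVsSet
{
    public static void main(String[] args)
    {
        // HashMap: stores key -> value pairs and retrieves a value by its key
        Map<Long, Integer> map = new HashMap<Long, Integer>();
        map.put(7L, 99);
        Integer value = map.get(7L);        // fetches 99

        // HashSet: stores only the objects themselves; it can answer
        // "is this element present?" but has no value to hand back
        Set<Long> set = new HashSet<Long>();
        set.add(7L);
        boolean present = set.contains(7L);

        System.out.println(value + " " + present);
    }
}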
HashMap lookups run in (amortized) constant time. You might expect a membership check like `if y in set1` to be slower when set1 contains 10 million elements than when it contains 1,000 elements, but it isn't: it takes essentially the same amount of time no matter how big set1 gets.
HashMap's get has an expected constant running time, which means its running time shouldn't depend on the size of the HashMap. This of course relies on a decent implementation of the hashCode method of your key, but here the key is Long, whose hashCode is well distributed, so that shouldn't be a problem.
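A rough, hand-rolled illustration of that claim (not a rigorous benchmark; run with a large heap, e.g. -Xmx2g, because of the 10-million-entry map):
import java.util.HashMap;
import java.util.Map;

public class ConstantTimeLookup
{
    public static void main(String[] args)
    {
        for (int size : new int[] {1000, 100000, 10000000})
        {
            Map<Long, Integer> map = new HashMap<Long, Integer>();
            for (long i = 0; i < size; i++)
                map.put(i, 0);

            // Do the same 1 million lookups regardless of how big the map is
            long sink = 0;
            long start = System.nanoTime();
            for (long i = 0; i < 1000000; i++)
                sink += map.get(i % size);
            long nsPerLookup = (System.nanoTime() - start) / 1000000;  // total ns / 1M lookups

            System.out.println(size + " entries: ~" + nsPerLookup + " ns per lookup (sink=" + sink + ")");
        }
    }
}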
This question was interesting, so I wrote my own test case to verify it. I found no difference in speed between lookups against a live map and one that was loaded from a serialized file. The program is available at the end of the post for anyone interested in running it.
Test environment:
- Serialized map size: 230 MB
- Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
- 4 CPUs running at 1.7 GHz
- 4 GB RAM at 800 MHz
Measuring is tricky. I myself noticed the 8 second lookup time that you described, but guess what else I noticed when that happened: GC activity. (The original answer included a profiler screenshot of it here.) Your measurements are probably picking that up too. If you isolate the measurements of Map.get() alone, you'll see that the results are comparable.
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class GenericTest
{
    public static void main(String... args)
    {
        // Call the methods as you please for a live vs ser <-> de_ser run
    }

    private static Map<Long, Integer> generateHashMap()
    {
        Map<Long, Integer> map = new HashMap<Long, Integer>();
        final Random random = new Random();
        for (int counter = 0; counter < 10000000; counter++)
        {
            final int value = random.nextInt();
            final long key = random.nextLong();
            map.put(key, value);
        }
        return map;
    }

    private static void lookupItems(int n, Map<Long, Integer> map)
    {
        final Random random = new Random();
        for (int counter = 0; counter < n; counter++)
        {
            final long key = random.nextLong();
            final Integer value = map.get(key);   // most random keys will miss; that is fine for timing
        }
    }

    private static void serialize(Map<Long, Integer> map)
    {
        try
        {
            File file = new File("temp/omaha.ser");
            FileOutputStream f = new FileOutputStream(file);
            ObjectOutputStream s = new ObjectOutputStream(new BufferedOutputStream(f));
            s.writeObject(map);
            s.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

    private static Map<Long, Integer> deserialize()
    {
        try
        {
            File file = new File("temp/omaha.ser");
            FileInputStream f = new FileInputStream(file);
            ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
            HashMap<Long, Integer> map = (HashMap<Long, Integer>) s.readObject();
            s.close();
            return map;
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }
}
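For a concrete run, here is one hypothetical way the empty main() above could be filled in, timing lookupItems() against a live map and a deserialized copy (make sure the temp/ directory exists first):
// Hypothetical main() body for GenericTest: compares a live map
// against a serialized-then-deserialized copy, timing only the lookups
public static void main(String... args)
{
    Map<Long, Integer> live = generateHashMap();
    long start = System.nanoTime();
    lookupItems(100000000, live);
    System.out.println("Live map: " + (System.nanoTime() - start) / 1000000 + " ms");

    serialize(live);   // writes temp/omaha.ser
    Map<Long, Integer> loaded = deserialize();
    start = System.nanoTime();
    lookupItems(100000000, loaded);
    System.out.println("Deserialized map: " + (System.nanoTime() - start) / 1000000 + " ms");
}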