Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Map/ArrayList: which one is faster to search for an element

Tags:

java

I have a gigantic data set which I've to store into a collection and need to find whethere any duplicates in there or not.

The data size could be more than 1 million. I know I can store more element in ArrayList comapre to a Map.

My questions are:

  1. is searching key in a Map faster than searching in sorted ArrayList?
  2. is searching Key in HashMap is faster than TreeMap?
  3. Only in terms of space required to store n elements, which would be more efficient between a TreeMap and a HashMap implementation?
like image 429
Hasan Avatar asked Dec 09 '11 20:12

Hasan


3 Answers

1) Yes. Searching an ArrayList is O(n) on average. The performance of key lookups in a Map depends on the specific implementation. You could write an implementation of Map that is O(n) or worse if you really wanted to, but all the implementations in the standard library are faster than O(n).

2) Yes. HashMap is O(1) on average for simple key lookups. TreeMap is O(log(n)).

Class HashMap<K,V>

This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets.

Class TreeMap<K,V>

This implementation provides guaranteed log(n) time cost for the containsKey, get, put and remove operations. Algorithms are adaptations of those in Cormen, Leiserson, and Rivest's Introduction to Algorithms.

3) The space requirements will be O(n) in both cases. I'd guess the TreeMap requires slightly more space, but only by a constant factor.

like image 57
Mark Byers Avatar answered Oct 16 '22 06:10

Mark Byers


  1. It depends on the type of Map you're using.
  2. A HashMap has a constant-time average lookup (O(1)), while a TreeMap's average lookup time is based on the depth of the tree (O(log(n))), so a HashMap is faster.
  3. The difference is probably moot. Both data structures require some amount of constant overhead in space complexity by design (both exhibit O(n) space complexity).
like image 29
piersadrian Avatar answered Oct 16 '22 06:10

piersadrian


It just did some benchmark testing on lookup performance between hashmap and sorted arraylist. The answer is hashmap is much faster as the size increase. I am talking about 10x, 20x, 30x faster. I did some test with 1 million of entries using sorted array list and hashmap and the array list get and add operation took seconds to complete, where as the hashmap get and put only takes around 50ms.
Here are something I found or observed: For sorted arraylist, you would have to sort it first to be able to use the search efficiently (binarySearch for example). Practically you don't just have static list (meaning the list will change via add or remove). With that in mind you will need to change the add and the get methods to do "binary" operation to make it efficient (like binarySearch). So even with binary operation the add and get method will be slower and slower as the list grows. Hashmap on the other hand does not show much of change in term of time in the put and get operation. The problem with Hashmap is memory overhead. If you can live with that then go with hashmap.

like image 1
PhucLy Avatar answered Oct 16 '22 08:10

PhucLy