Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does hashing have an o(1) search time? [duplicate]

Tags:

java

hashmap

When we use a HashTable for storing data, it is said that searching takes o(1) time. I am confused, can anybody explain?

like image 531
algo-geeks Avatar asked Dec 06 '10 05:12

algo-geeks


People also ask

What is the time complexity of hashing?

In hashing, all the above operations can be performed in O(1) i.e. constant time. It is important to understand that the worst case time complexity for hashing remains O(n) but the average case time complexity is O(1).

What is the time complexity of double hashing?

Time Complexity of Double Hashing: Hence total time complexity = O(n) time.

How does a hash table get close to constant time performance?

It is technically true because the hash function is not required to use all the information in the key and so could be constant time, and because a large enough table can bring collisions down to near constant time.

What is the efficiency of searching a hash table?

The hash table with the best memory efficiency is simply the one with the highest load factor, (it can even exceed 100% memory efficiency by using key compression with compact hashing ). A hash table like that does still provide O(1) lookups, just very slow.


2 Answers

Well it's a little bit of a lie -- it can take longer than that, but it usually doesn't.

Basically, a hash table is an array containing all of the keys to search on. The position of each key in the array is determined by the hash function, which can be any function which always maps the same input to the same output. We shall assume that the hash function is O(1).

So when we insert something into the hash table, we use the hash function (let's call it h) to find the location where to put it, and put it there. Now we insert another thing, hashing and storing. And another. Each time we insert data, it takes O(1) time to insert it (since the hash function is O(1).

Looking up data is the same. If we want to find a value, x, we have only to find out h(x), which tells us where x is located in the hash table. So we can look up any hash value in O(1) as well.

Now to the lie: The above is a very nice theory with one problem: what if we insert data and there is already something in that position of the array? There is nothing which guarantees that the hash function won't produce the same output for two different inputs (unless you have a perfect hash function, but those are tricky to produce). Therefore, when we insert we need to take one of two strategies:

  • Store multiple values at each spot in the array (say, each array slot has a linked list). Now when you do a lookup, it is still O(1) to arrive at the correct place in the array, but potentially a linear search down a (hopefully short) linked list. This is called "separate chaining".
  • If you find something is already there, hash again and find another location. Repeat until you find an empty spot, and put it there. The lookup procedure can follow the same rules to find the data. Now it's still O(1) to get to the first location, but there is a potentially (hopefully short) linear search to bounce around the table till you find the data you are after. This is called "open addressing".

Basically, both approaches are still mostly O(1) but with a hopefully-short linear sequence. We can assume for most purposes that it is O(1). If the hash table is getting too full, those linear searches can become longer and longer, and then it is time to "re-hash" which means to create a new hash table of a much bigger size and insert all the data back into it.

like image 81
mgiuca Avatar answered Oct 21 '22 03:10

mgiuca


Searching takes O(1) time if there is no collisons in the hashtable , so it is incorrect to sya that searching in a hashtable takes O(1) or constant time.

See how Hashtable works on MSDN?

EDIT

mgiuca explains it beautifully and i am just adding one more Collosion Avoidance technique which is called Chaining.

IN this technique , We maintain a linklist of values at each location so when you have a collosion , your value will be added to the Linklist at that position so when you are searching for a value there may be scenario that you need to search the value in whole link list so in that case searching will not be O(1) operation.

like image 26
TalentTuner Avatar answered Oct 21 '22 03:10

TalentTuner