When we use a HashTable
for storing data, it is said that searching takes o(1) time. I am confused, can anybody explain?
In hashing, all the above operations can be performed in O(1) i.e. constant time. It is important to understand that the worst case time complexity for hashing remains O(n) but the average case time complexity is O(1).
Time Complexity of Double Hashing: Hence total time complexity = O(n) time.
It is technically true because the hash function is not required to use all the information in the key and so could be constant time, and because a large enough table can bring collisions down to near constant time.
The hash table with the best memory efficiency is simply the one with the highest load factor, (it can even exceed 100% memory efficiency by using key compression with compact hashing ). A hash table like that does still provide O(1) lookups, just very slow.
Well it's a little bit of a lie -- it can take longer than that, but it usually doesn't.
Basically, a hash table is an array containing all of the keys to search on. The position of each key in the array is determined by the hash function, which can be any function which always maps the same input to the same output. We shall assume that the hash function is O(1).
So when we insert something into the hash table, we use the hash function (let's call it h) to find the location where to put it, and put it there. Now we insert another thing, hashing and storing. And another. Each time we insert data, it takes O(1) time to insert it (since the hash function is O(1).
Looking up data is the same. If we want to find a value, x, we have only to find out h(x), which tells us where x is located in the hash table. So we can look up any hash value in O(1) as well.
Now to the lie: The above is a very nice theory with one problem: what if we insert data and there is already something in that position of the array? There is nothing which guarantees that the hash function won't produce the same output for two different inputs (unless you have a perfect hash function, but those are tricky to produce). Therefore, when we insert we need to take one of two strategies:
Basically, both approaches are still mostly O(1) but with a hopefully-short linear sequence. We can assume for most purposes that it is O(1). If the hash table is getting too full, those linear searches can become longer and longer, and then it is time to "re-hash" which means to create a new hash table of a much bigger size and insert all the data back into it.
Searching takes O(1) time if there is no collisons in the hashtable , so it is incorrect to sya that searching in a hashtable takes O(1) or constant time.
See how Hashtable works on MSDN?
EDIT
mgiuca explains it beautifully and i am just adding one more Collosion Avoidance technique which is called Chaining.
IN this technique , We maintain a linklist of values at each location so when you have a collosion , your value will be added to the Linklist at that position so when you are searching for a value there may be scenario that you need to search the value in whole link list so in that case searching will not be O(1) operation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With