Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is searching a hashtable for a value that isn't there O(n)? (linear probing)

Just trying to understand the linear probing logic.

With as hashtable using open addressing, how can you ever confirm that an element isn't in the table.

For example, say you had a 10 bucket hashmap. Suppose you hash a key, and insert it. Now, if element A and B are to be inserted and hash and reduce to the same bucket then element A and B if using a linear probe will likely be next to each other.

Now, just because a bucket is empty, does not seem to mean that an element doesn't exist in the map. e.g. You search for element B after element A is first removed. First you get an empty bucket where you expect B to be, but you need to probe one more and you'll find B. It's really is there. If that logic is correct then won't you have to search the entire table to confirm whether a key is not there? i.e. O(n) performance everytime.

What I'm saying is, don't you need to go through the whole map to truly confirm it's not there?

With a separate chaining approach to hashmap's that problem doesn't exist.

For example: enter image description here

If John Smith is deleted, and we try to locate Sandra Dee.

Or with linear addressing are you meant to move elements around so that there are no holes as such. i.e. If John Smith is deleted should elements from 152 to 154 be shifted back one place? It don't really see that in the description but that might make some sense. Except if ted baker hashed to 154 and not 153 as described. Requires a bit more work than I thought.

Might just go with a simple linked list at each bucket.

like image 811
hookenz Avatar asked Jun 14 '11 03:06

hookenz


2 Answers

In the absolute worst case, yes the algorithm to determine whether or not some item is in the table is O(n). However, this will never happen in a properly managed hash table.

When an item is removed, a tombstone should be placed at the table slot that it was removed from. The tombstone is simply some data to indicate that there used to be an element there, but it has been removed. In this way, every time you search for an element, you must follow whatever probe sequence you are using until you find a slot that is empty. If a slot is empty you have completed the probe sequence for that hash value and know that it will not be at any other place in the table.

The only way that you would have to search through every slot in the probe sequence is if there are no empty slots in the probe sequence. Since you should always aim for a hash table to be about half empty, this should not happen.

like image 162
Deddryk Avatar answered Nov 15 '22 06:11

Deddryk


Use of a hashtable that resolves conflicts with a probing strategy places stringent requirements on the deletion capabilities of the data structure. When you delete an item, you must compensate the hashtable so that it still meets the requirements needed for search (which is the main point of a hashtable).

Using linear probing, if you delete an item, you move to the next slot. If it matches the hash for the slot we just deleted, move it. Rinse and repeat until you get to an empty slot. There are also lazy deletion strategies as well, where items are marked for deletion, and then actually deleted/compensated on the next search.

Assume three items {A0, A1, A2} that have the same Hash value. Let {A0,A1} be in the Hashtable and {A2} is not. If we delete A0, we mark it for deletion. When we do a search for A2 (not in the Hashtable) we find A0 (deleted), we then move to A1 which we relocate into A0's slot, finalizing the deletion. We move to the next slot (which might be A2, or a candidate for the slot A1 was just occupying) but find it its empty so we clear out the slot that A1 just vacated and our hashtable is back to a pristine state.

like image 36
Ethan Cabiac Avatar answered Nov 15 '22 06:11

Ethan Cabiac