Hashtable with doubly linked lists?

Introduction to Algorithms (CLRS) states that a hash table using doubly linked lists can delete items more quickly than one using singly linked lists. Can anybody tell me what the advantage of using doubly linked lists instead of singly linked lists is for deletion in a hash table implementation?

Dan Paradox asked Jul 28 '11


2 Answers

The confusion here is due to the notation in CLRS. To stay consistent with the question, I use the CLRS notation in this answer.

We use the hash table to store key-value pairs. The value portion is not mentioned in the CLRS pseudocode, while the key portion is defined as k.

In my copy of CLR (I am working off of the first edition here), the routines listed for hashes with chaining are insert, search, and delete (with more verbose names in the book). The insert and delete routines take argument x, which is the linked list element associated with key key[x]. The search routine takes argument k, which is the key portion of a key-value pair. I believe the confusion is that you have interpreted the delete routine as taking a key, rather than a linked list element.

Since x is a linked list element, having it alone is sufficient to do an O(1) deletion from the linked list in the h(key[x]) slot of the hash table, if it is a doubly-linked list. If, however, it is a singly-linked list, having x is not sufficient. In that case, you need to start at the head of the linked list in slot h(key[x]) of the table and traverse the list until you finally hit x to get its predecessor. Only when you have the predecessor of x can the deletion be done, which is why the book states the singly-linked case leads to the same running times for search and delete.
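To make the O(1) deletion concrete, here is a minimal Python sketch of chaining with doubly linked lists. Only the routine names mirror the CLRS pseudocode; the class names, field names, and method bodies are my own illustration.

```python
class Node:
    """A doubly linked list element holding a key-value pair (key[x] in CLRS)."""
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None

class ChainedHashTable:
    def __init__(self, m=8):
        self.m = m
        self.slots = [None] * m   # T[0..m-1], each the head of a chain

    def _h(self, key):
        return hash(key) % self.m

    def insert(self, x):
        """CHAINED-HASH-INSERT: push element x onto the front of T[h(key[x])]."""
        i = self._h(x.key)
        x.prev = None
        x.next = self.slots[i]
        if self.slots[i] is not None:
            self.slots[i].prev = x
        self.slots[i] = x

    def search(self, k):
        """CHAINED-HASH-SEARCH: scan chain T[h(k)] for an element with key k."""
        x = self.slots[self._h(k)]
        while x is not None and x.key != k:
            x = x.next
        return x

    def delete(self, x):
        """CHAINED-HASH-DELETE: O(1) given the element x itself,
        because the prev link hands us the predecessor immediately."""
        if x.prev is not None:
            x.prev.next = x.next
        else:                     # x was the head of its chain
            self.slots[self._h(x.key)] = x.next
        if x.next is not None:
            x.next.prev = x.prev
```

Note that `delete` never walks a chain: given x, both neighbors are one pointer away, which is exactly the doubly-linked advantage the book describes.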

Additional Discussion

Although CLRS says that you can do the deletion in O(1) time, assuming a doubly-linked list, it also requires that you have x when calling delete. The point is this: they defined the search routine to return an element x. That search is not constant time for an arbitrary key k. Once you get x from the search routine, you avoid incurring the cost of another search in the call to delete when using doubly-linked lists.

The pseudocode routines are lower level than you would use if presenting a hash table interface to a user. For instance, a delete routine that takes a key k as an argument is missing. If that delete is exposed to the user, you would probably just stick to singly-linked lists and have a special version of search to find the x associated with k and its predecessor element all at once.
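Such a special version of search might look like the following. This is a sketch of my own, not a routine from CLRS: one traversal of a singly linked chain returns both the element and its predecessor, so a key-based delete still makes only one pass.

```python
class Node:
    """A singly linked chain element (next pointer only)."""
    def __init__(self, key, value, next=None):
        self.key = key
        self.value = value
        self.next = next

def search_with_pred(head, k):
    """Walk the chain once; return (pred, x) where x holds key k and pred is
    its predecessor (None if x is the head). Returns (None, None) if k is absent."""
    pred, x = None, head
    while x is not None:
        if x.key == k:
            return pred, x
        pred, x = x, x.next
    return None, None

def delete_by_key(table, h, k):
    """Key-based delete on a singly linked chained table: the single
    traversal above finds x and its predecessor at the same time."""
    i = h(k)
    pred, x = search_with_pred(table[i], k)
    if x is None:
        return False
    if pred is None:
        table[i] = x.next    # x was the head of the chain
    else:
        pred.next = x.next
    return True
```

With this interface there is no second scan to pay for, which is why a user-facing delete-by-key could reasonably stay with singly linked lists.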

David Alber answered Nov 01 '22


Unfortunately my copy of CLRS is in another country right now, so I can't use it as a reference. However, here's what I think it is saying:

Basically, a doubly linked list supports O(1) deletions because if you know the address of the item, you can just do something like:

x.left.right = x.right;
x.right.left = x.left;

to delete the object from the linked list, whereas in a singly linked list, even if you have the address, you need to search through the list to find its predecessor in order to do:

pred.next = x.next;

So, when you delete an item from the hash table, you look it up, which is O(1) due to the properties of hash tables, then delete it in O(1), since you now have the address.

If this were a singly linked list, you would need to find the predecessor of the object you wish to delete, which would take O(n).
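The two snippets above can be made concrete with a small runnable sketch (the node classes and helper names are mine, not library code; `unlink` assumes x is an interior node with both neighbors present):

```python
class DNode:
    """Doubly linked node; left/right follow the field names used above."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def unlink(x):
    """O(1): splice x out using x alone (interior node assumed)."""
    x.left.right = x.right
    x.right.left = x.left

class SNode:
    """Singly linked node: only a next pointer."""
    def __init__(self, key, next=None):
        self.key = key
        self.next = next

def unlink_singly(head, x):
    """O(n): must walk from head to find pred before pred.next = x.next."""
    pred = head
    while pred.next is not x:
        pred = pred.next
    pred.next = x.next
```

The doubly linked version touches only x's two neighbors; the singly linked version pays for a traversal even though x itself is already in hand.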


However:

I am also slightly confused about this assertion in the case of chained hash tables, because of how lookup works. In a chained hash table, if there is a collision, you already need to walk through the linked list of values in order to find the item you want, and in doing so you would pass its predecessor anyway.

But, the way the statement is phrased gives clarification: "If the hash table supports deletion, then its linked lists should be doubly linked so that we can delete an item quickly. If the lists were only singly linked, then to delete element x, we would first have to find x in the list T[h(x.key)] so that we could update the next attribute of x’s predecessor."

This is saying that you already have element x, which means you can delete it in the above manner. If you were using a singly linked list, even if you had element x already, you would still have to find its predecessor in order to delete it.

Patrick answered Nov 01 '22