Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What does "bucket entries" mean in the context of a hashtable?

Tags:

hashtable

What does "bucket entries" mean in the context of a hashtable?

like image 212
user1072898 Avatar asked Jan 31 '12 03:01

user1072898


People also ask

What is the number of buckets in a hash table?

The number of buckets in a hash structure will almost always be on the order of the number of items in the hash structure. The phrase "on the order of" is intentionally imprecise. That means you could have twice as many buckets as items. Or two times as many items as buckets.

What is bucket in data structure?

A bucket data structure is a data structure that uses the key values as the indices of the buckets, and store items of the same key value in the corresponding bucket. Naturally it makes the most sense to use the bucket data structure with integer key values.

Which bucket does each hash key map to?

A hash function maps each key to an integer in the range [0, N -1], where N is the capacity of the bucket array for the hash table. The main idea is to use the hash value, h(k), as an index into our bucket array, A, instead of the key k (which is most likely inappropriate for use as a bucket array index).

How is a hash table implemented?

Hashing is implemented in two steps: An element is converted into an integer by using a hash function. This element can be used as an index to store the original element, which falls into the hash table. The element is stored in the hash table where it can be quickly retrieved using hashed key.


1 Answers

A bucket is simply a fast-access location (like an array index) that is the the result of the hash function.

The idea with hashing is to turn a complex input value into a different value which can be used to rapidly extract or store data.

Consider the following hash function for mapping people's names into street addresses.

First take the initials from the first and last name and turn them both into numeric values (0 through 25, from A through Z). Multiply the first by 26 and add the second, and this gives you a value from 0 to 675 (26 * 26 distinct values, or bucket IDs). This bucket ID is then to be used to store or retrieve the information.


Now you can have a perfect hash (where each allowable input value maps to a distinct bucket ID) so that a simple array will suffice for the buckets. In that case, you can just maintain an array of 676 street addresses and use the bucket ID to find the one you want:

+-------------------+ | George Washington | -> hash(GW) +-------------------+      |                            +-> GwBucket[George's address] +-------------------+ |  Abraham Lincoln  | -> hash(AL) +-------------------+      |                            +-> AlBucket[Abe's address] 

However, this means that George Wendt and Allan Langer are going to cause problems in the future.


Or you can have an imperfect hash (such as one where John Smith and Jane Seymour would end up with the same bucket ID).

In that case, you need a more complex backing data structure than a simple array, to maintain a collection of addresses. This could be as simple as a linked list, or as complex as yet another hash:

+------------+       +--------------+ | John Smith |       | Jane Seymour | +------------+       +--------------+       |                     |       V                     V    hash(JS)              hash(JS)       |                     |       +-----> JsBucket <----+                  \/ +-----------------------------------+ | John Smith   ->  [John's address] | | Jane Seymour ->  [Jane's address] | +-----------------------------------+ 

Then, as well as the initial hash lookup, an extra level of searching needs to be carried out within the bucket itself, to find the specific information.

like image 94
paxdiablo Avatar answered Oct 12 '22 15:10

paxdiablo