When coding extendible hashing, one has the choice of using the most significant bits or the least significant bits of the hash value to determine which bucket to hash to. Using the least significant bits has a clear advantage: when the directory needs to grow, doubling it amounts to simply appending a copy of the existing entries.
What I can't wrap my head around is why reference after reference after reference shows extendible hashing done with most significant bits. As far as I can tell, the only advantage most significant bits yields is a diagram on paper (or on screen) that doesn't have crossing lines. Is there any good reason why so many sources use most significant bits instead of least?
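To make the two choices concrete, here is a minimal sketch of both indexing schemes. The function names, the 32-bit hash width, and the example value are my own assumptions for illustration, not from any particular reference.

```python
# Two ways to pick a directory slot from a hash value, assuming a
# 32-bit hash and a directory of global depth d (i.e. 2**d slots).
HASH_BITS = 32

def msb_index(h: int, d: int) -> int:
    """Use the d most significant bits (a prefix) as the directory index."""
    return h >> (HASH_BITS - d)

def lsb_index(h: int, d: int) -> int:
    """Use the d least significant bits (a suffix) as the directory index."""
    return h & ((1 << d) - 1)

# Example 32-bit hash: top three bits are 101, bottom three bits are 110.
h = 0b1011_0000_0000_0000_0000_0000_0000_0110
print(msb_index(h, 3))  # -> 5 (binary 101)
print(lsb_index(h, 3))  # -> 6 (binary 110)
```

Both pick out d bits of the hash; the difference only matters for how the directory is laid out and grown.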
I finally went back to the original source paper by Fagin et al. They address this:
"We note that if we had used suffixes of pseudokeys instead of prefixes, then the algorithm for doubling the directory would be especially easy: it would essentially consist of making a second copy of the nonheader portion of the directory, immediately after the first copy. However, we chose to use prefixes for the sake of intuitive simplicity (thus, by using prefixes the keys can easily be accessed in pseudokey order, rather than in inverted pseudokey order). "
I don't understand why they saw this approach as more intuitive, since you could dispense with the whole bit idea and use modular arithmetic instead, but that appears to have been their rationale.