understanding of hash code

Tags:

java

hash

A hash function is important in implementing a hash table. I know that in Java every Object has a hash code, which might be generated by a weak hash function.

The following snippet is a "supplemental hash function":

static int hash(Object x) {
    int h = x.hashCode();

    h += ~(h << 9);
    h ^=  (h >>> 14);
    h +=  (h << 4);
    h ^=  (h >>> 10);
    return h;
}

Can anybody explain the fundamental idea of a hash algorithm? Is it to generate non-duplicate integers? If so, how do these bitwise operations achieve that?

asked Jun 25 '10 21:06

3 Answers

A hash function is any well-defined procedure or mathematical function that converts a large, possibly variable-sized amount of data into a small datum, usually a single integer that may serve as an index to an array. The values returned by a hash function are called hash values, hash codes, hash sums, checksums or simply hashes. (Wikipedia)

In more "human" language, an object's hash is a short, compact value computed from the object's properties. That is, if you have two objects that differ somehow, you can usually expect their hash values to be different, although two different objects can occasionally share a hash value. A good hash algorithm produces different values for different objects as often as possible.
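To make "a value based on the object's properties" concrete, here is a minimal sketch in Java. The `Point` class and its fields are made up for illustration and are not from the question. It derives the hash from the same fields that `equals()` compares, and it also shows that two different objects can still share a hash code.

```java
import java.util.Objects;

// Hypothetical example class; the name and fields are illustrative only.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    // The hash is derived from the same properties that equals() compares,
    // so equal objects always report equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class HashFromPropertiesDemo {
    public static void main(String[] args) {
        // Equal properties give equal hash codes.
        System.out.println(new Point(1, 2).hashCode() == new Point(1, 2).hashCode()); // true
        // Different objects can still collide: these two strings share a hash code.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
    }
}
```

The second line of output is why a hash table can never rely on the hash code alone and still has to compare keys with `equals()`.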

answered Sep 29 '22 23:09

Vadym Stetsiak

What you are usually trying to do with a hash algorithm is convert a large search key into a small nonnegative number, so you can look up an associated record in a table somewhere, and do it more quickly than the M log2 N cost (where M is the cost of a "comparison" and N is the number of items in the "table") typical of a binary search (or tree search).

If you are lucky enough to have a perfect hash, you know that any element of your (known!) key set will be hashed to a unique, different value. Perfect hashes are primarily of interest for things like compilers that need to look up language keywords.
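As a rough illustration of a perfect hash over a known key set, the sketch below uses a hand-picked function, (first character + length) mod 7, that happens to map five keywords to five different slots. The keyword list, the function, and the class name are all assumptions chosen for this example. Real tools search for such a function automatically.

```java
class KeywordTable {
    // Hypothetical keyword set, chosen so the hand-picked function below has no collisions.
    private static final String[] KEYWORDS = {"if", "for", "while", "do", "int"};
    private static final String[] SLOTS = new String[7];

    static {
        for (String k : KEYWORDS) {
            int s = slot(k);
            // Fail loudly if the function is not perfect for this key set.
            if (SLOTS[s] != null) throw new IllegalStateException("not a perfect hash");
            SLOTS[s] = k;
        }
    }

    // Hand-picked for this key set only: (first character + length) mod 7.
    // Callers must pass a non-empty word.
    static int slot(String word) {
        return (word.charAt(0) + word.length()) % 7;
    }

    // One hash computation and one comparison; no bucket search is needed.
    static boolean isKeyword(String word) {
        return !word.isEmpty() && word.equals(SLOTS[slot(word)]);
    }

    public static void main(String[] args) {
        System.out.println(isKeyword("while")); // true
        System.out.println(isKeyword("else"));  // false
    }
}
```

The final comparison is still needed, because a word outside the key set (such as "else") can land on an occupied slot.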

In the real world, you have imperfect hashes, where several keys all hash to the same value. That's OK: you now only have to compare the key to a small set of candidate matches (the ones that hash to that value), rather than a large set (the full table). The small sets are traditionally called "buckets". You use the hash algorithm to select a bucket, then you use some other searchable data structure for the buckets themselves. (If the number of elements in a bucket is known, or safely expected, to be really small, linear search is not unreasonable. Binary search trees are also reasonable.)
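The bucket idea can be sketched as follows. This is a minimal illustration, not how `java.util.HashMap` is actually written: the class and method names are invented, the bucket count is fixed, and each bucket is a plain list searched linearly.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal sketch of a bucketed table; class and method names are illustrative.
class BucketTable {
    private static final class Entry {
        final String key;
        int value;

        Entry(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry>> buckets = new ArrayList<>();

    BucketTable(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) buckets.add(new ArrayList<>());
    }

    // The hash code only selects a bucket; floorMod keeps the index nonnegative.
    private List<Entry> bucketFor(String key) {
        return buckets.get(Math.floorMod(key.hashCode(), buckets.size()));
    }

    void put(String key, int value) {
        List<Entry> bucket = bucketFor(key);
        for (Entry e : bucket) {
            if (e.key.equals(key)) {
                e.value = value; // key already present: overwrite
                return;
            }
        }
        bucket.add(new Entry(key, value));
    }

    // Linear search within the selected bucket; returns null when the key is absent.
    Integer get(String key) {
        for (Entry e : bucketFor(key)) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    public static void main(String[] args) {
        BucketTable t = new BucketTable(16);
        t.put("Aa", 1);
        t.put("BB", 2); // "BB" has the same hashCode as "Aa", so it shares a bucket
        System.out.println(t.get("Aa")); // 1
        System.out.println(t.get("BB")); // 2
        System.out.println(t.get("CC")); // null
    }
}
```

Only the keys in one bucket are ever compared, which is the whole point: the hash narrows the search, and `equals()` settles it.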

The bitwise operations in your example look a lot like a signature analysis shift register, which tries to compress a long unique pattern of bits into a short, still-unique pattern.

answered Sep 30 '22 01:09

John R. Strohm

Basically, the thing you're trying to achieve with a hash function is to give all bits in the hash code a roughly 50% chance of being off or on given a particular item to be hashed. That way, it doesn't matter how many "buckets" your hash table has (or put another way, how many of the bottom bits you take in order to determine the bucket number)-- if every bit is as random as possible, then an item will always be assigned to an essentially random bucket.

Now, in real life, many people use hash functions that aren't that good. They have some randomness in some of the bits, but not all of them. For example, imagine if you have a hash function whose bits 6-7 are biased-- let's say in the typical hash code of an object, they have a 75% chance of being set. In this made up example, if our hash table has 256 buckets (i.e. the bucket number comes from bits 0-7 of the hash code), then we're throwing away the randomness that does exist in bits 8-31, and a smaller portion of the buckets will tend to get filled (i.e. those whose numbers have bits 6 and 7 set).
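The made-up example above can be simulated. The sketch below builds hash codes whose bits 6 and 7 are each set with 75% probability, takes the bottom 8 bits as the bucket number, and measures how many items land in buckets 192-255 (the bucket numbers with both bits set). The class name, sample count, and seed are arbitrary choices for illustration.

```java
import java.util.Random;

class BiasedBitsDemo {
    // Builds a made-up hash code: random bits everywhere, except that bits 6 and 7
    // are each set with 75% probability instead of 50%.
    static int biasedHash(Random rng) {
        int h = rng.nextInt();
        h &= ~0xC0; // clear bits 6 and 7
        if (rng.nextDouble() < 0.75) h |= 1 << 6;
        if (rng.nextDouble() < 0.75) h |= 1 << 7;
        return h;
    }

    // Fraction of items landing in buckets 192-255 when the bottom 8 bits
    // of the hash pick one of 256 buckets.
    static double shareInTopQuarter(int samples, long seed) {
        Random rng = new Random(seed);
        int hits = 0;
        for (int i = 0; i < samples; i++) {
            int bucket = biasedHash(rng) & 0xFF;
            if (bucket >= 192) hits++;
        }
        return (double) hits / samples;
    }

    public static void main(String[] args) {
        // An unbiased hash would put about 25% of items in these 64 buckets;
        // 0.75 * 0.75 predicts about 56% for the biased one.
        System.out.println(shareInTopQuarter(100_000, 42L));
    }
}
```

So a quarter of the buckets ends up holding more than half of the items, and lookups in those buckets get correspondingly slower.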

The supplementary hash function basically tries to spread whatever randomness there is in the hash codes over a larger number of bits. So in our hypothetical example, the idea would be that some of the randomness from bits 8-31 will get mixed in with the lower bits, and dilute the bias of bits 6-7. It still won't be perfect, but better than before.
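To see the mixing at work, the sketch below feeds the supplementary hash from the question a deliberately bad set of hash codes: values that differ only in bits 8 and above. With 256 buckets chosen by the bottom 8 bits, the raw codes all fall into bucket 0, while the mixed codes reach more than one bucket. The class name, the helper `bucketsUsed`, and the input set are assumptions made for this demonstration.

```java
import java.util.HashSet;
import java.util.Set;

class SupplementalHashDemo {
    // The supplementary hash from the question, applied to a raw int hash code.
    static int supplementalHash(int h) {
        h += ~(h << 9);
        h ^=  (h >>> 14);
        h +=  (h << 4);
        h ^=  (h >>> 10);
        return h;
    }

    // Counts how many of 256 buckets are hit when the bottom 8 bits pick the bucket.
    static int bucketsUsed(int[] hashCodes, boolean mix) {
        Set<Integer> used = new HashSet<>();
        for (int h : hashCodes) {
            int value = mix ? supplementalHash(h) : h;
            used.add(value & 0xFF);
        }
        return used.size();
    }

    public static void main(String[] args) {
        // Made-up worst case: hash codes that differ only in bits 8 and above.
        int[] hashCodes = new int[64];
        for (int i = 0; i < hashCodes.length; i++) hashCodes[i] = i << 8;

        System.out.println("raw:   " + bucketsUsed(hashCodes, false)); // 1
        System.out.println("mixed: " + bucketsUsed(hashCodes, true));
    }
}
```

The shifts and additions carry information from the high bits down into the low bits, which is exactly what the bucket selection needs.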

answered Sep 30 '22 00:09

Neil Coffey

Question asked by SecureFish
