Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a "good enough" hash function for the average programmer?

We are told that we should implement hashCode() for our classes but most people like me have no real idea of how to do this or what happens if we get it "wrong". For example I have a need for a hash function for indexing nodes in a tree (Finding the most frequent subtrees in a collection of (parse) trees). In this case I need to recursively generate hashcodes based on ordered child nodes, e.g.

hashCode = function(child1.hashCode, child2.hashCode, ...)

In a recent discussion of hashCodes answers included a hash for strings (based on a long prime and 31) and also bitshifting. The String hash is:

// adapted from String.hashCode()
public static long hash(String string) {
  long h = 1125899906842597L; // prime
  int len = string.length();

  for (int i = 0; i < len; i++) {
    h = 31*h + string.charAt(i);
  }
  return h;
}

I'm not interested in security and don't mind collisions. Is there a "universal function" for combining hashcodes of ordered objects that will do more good than harm (and more good than not calling it at all)?

Also is there a site where we can look up common cases? strings, lists, etc.)

I didn't specify a language as I was hoping there were universal approaches. But if it is seriously language-specific then please indicate the language and why it's not universal.

UPDATE Two suggestions are to use the IDE's hashCode generator. That seems an excellent default; Here's Netbeans:

public int hashCode() {
    int hash = 5;
// objects
    hash = 97 * hash + (this.rootElement != null ? this.rootElement.hashCode() : 0);
    hash = 97 * hash + (this.tableElement != null ? this.tableElement.hashCode() : 0);
// a string
    hash = 97 * hash + (this.tag != null ? this.tag.hashCode() : 0);
    return hash;
}
like image 265
peter.murray.rust Avatar asked Nov 06 '09 19:11

peter.murray.rust


1 Answers

There is an excellent hashCode() in Joshua Bloch's Effective Java. Sample Chapter 3, "Methods Common to All Objects" was free (well, it used to be back when there was a page on Sun's old site for it. If you search you might still find a PDF version of that chapter lying around somewhere).

You could also look at the source for HashCodeBuilder in Apache Commons Lang, but just don't copy it for class without citing it. It is time well spent to actually learn about this -- it will make you a better person.

like image 73
dustmachine Avatar answered Oct 04 '22 07:10

dustmachine