
Murmurhash 2 results on Python and Haskell

Haskell and Python don't seem to agree on Murmurhash2 results. Python, Java, and PHP return the same result, but Haskell doesn't. Am I doing something wrong with Murmurhash2 in Haskell?

Here is my code for Haskell Murmurhash2:

import Data.Digest.Murmur32

main = do
    print $ asWord32 $ hash32WithSeed 1 "woohoo"

And here is the code written in Python:

import murmur

if __name__ == "__main__":
    print murmur.string_hash("woohoo", 1)

Python returned 3650852671, while Haskell returned 3966683799.

Axel Advento asked May 03 '13 07:05



1 Answer

From a quick inspection of the sources, it looks like the algorithm operates on 32 bits at a time. The Python version gets these by simply grabbing 4 bytes at a time from the input string, while the Haskell version converts each character to a single 32-bit value holding its Unicode code point.

It's therefore not surprising that they yield different results.
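To make the difference concrete, here is a minimal Python sketch of the two ways of turning "woohoo" into 32-bit words. It does not call either library and does not compute a hash; the helper names `words_from_bytes` and `words_from_chars` are made up for illustration, and the little-endian byte order and UTF-8 encoding are assumptions about how the byte-oriented version reads its input.

```python
import struct


def words_from_bytes(s):
    """Byte-oriented view: pack every 4 bytes into one 32-bit word.

    Assumes UTF-8 encoding and little-endian word order. Leftover
    bytes (fewer than 4) are not returned here; a real implementation
    would mix them in separately as a tail.
    """
    data = s.encode("utf-8")
    full = len(data) - len(data) % 4  # length of the 4-byte-aligned prefix
    return [struct.unpack_from("<I", data, i)[0] for i in range(0, full, 4)]


def words_from_chars(s):
    """Character-oriented view: one 32-bit word per character,
    holding that character's Unicode code point."""
    return [ord(c) for c in s]


if __name__ == "__main__":
    print([hex(w) for w in words_from_bytes("woohoo")])
    print([hex(w) for w in words_from_chars("woohoo")])
```

With this input the byte-oriented view produces a single word (plus two leftover bytes), while the character-oriented view produces six words, so the mixing steps of the hash see entirely different data even though the string and the seed are the same.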

hammar answered Sep 18 '22 09:09
