I am trying to implement an equivalent of Java's hashCode in Node.js and Python in order to implement Redis sharding. I am following the very good blog post at the link below: http://mechanics.flite.com/blog/2013/06/27/sharding-redis/
But I am stuck on a difference in the hash code when the string contains non-ASCII characters, as in the example below. For regular strings, Node.js and Python give me the same hash code.
Here is the code I am using to generate it:
--Python
import ctypes

def _java_hashcode(s):
    hash_code = 0
    for char in s:
        hash_code = 31 * hash_code + ord(char)
    # Truncate to a signed 32-bit integer, like a Java int
    return ctypes.c_int32(hash_code).value
--Node as per above blog
String.prototype.hashCode = function() {
    for (var ret = 0, i = 0, len = this.length; i < len; i++) {
        ret = (31 * ret + this.charCodeAt(i)) << 0;
    }
    return ret;
};
--Python output
For string '者:s��2�*�=x�' hash is = 2014651066
For string '359196048149234' hash is = 1145341990
--Node output
For string '者:s��2�*�=x�' hash is = 150370768
For string '359196048149234' hash is = 1145341990
Please guide me: where am I going wrong? Do I need to set some kind of encoding in the Python and Node programs? I tried a few, but my Python program breaks.
def java_string_hashcode(s):
    """Mimic Java's hashCode in python 2"""
    try:
        s = unicode(s)
    except:
        try:
            s = unicode(s.decode('utf8'))
        except:
            raise Exception("Please enter a unicode type string or utf8 bytestring.")
    h = 0
    for c in s:
        h = int((((31 * h + ord(c)) ^ 0x80000000) & 0xFFFFFFFF) - 0x80000000)
    return h
This is how you should do it in Python 2.
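For Python 3, where every str is already Unicode, a rough equivalent might look like the sketch below. The name java_string_hashcode_py3 is made up for illustration. It assumes the goal is to match Java's String.hashCode, which works on UTF-16 code units, so the string is encoded to UTF-16 first; that way characters outside the Basic Multilingual Plane contribute two units, as they would in Java and in charCodeAt.

```python
def java_string_hashcode_py3(s):
    """Sketch of Java's String.hashCode for a Python 3 str (hypothetical helper)."""
    # Java hashes UTF-16 code units, so encode first; "surrogatepass"
    # lets lone surrogates through instead of raising an error.
    data = s.encode("utf-16-be", "surrogatepass")
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]  # one big-endian 16-bit code unit
        h = (31 * h + unit) & 0xFFFFFFFF     # keep only the low 32 bits
    # Reinterpret the unsigned 32-bit value as a signed Java int
    return h - 0x100000000 if h >= 0x80000000 else h


print(java_string_hashcode_py3("hello"))  # 99162322, same as "hello".hashCode() in Java
```

Masking on every step keeps the intermediate value small, but it gives the same result as truncating once at the end.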
The problem is twofold:
Also, as noted in the other answer, if you hard-code non-ASCII characters, save your source file as UTF-8 and put this at the top of the file:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
Also make sure that any user input you receive is handled as the unicode type and not as a byte string (str). (This is not a problem in Python 3.)
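To see why the type matters, here is a small Python 3 sketch that hashes the same character once as raw UTF-8 bytes and once as decoded text. The helper name hash_units is made up for this example. Hashing the bytes is roughly what Python 2 does when it is handed an undecoded str, while hashing the decoded text is what charCodeAt works with in Node, so this is one plausible source of the kind of mismatch shown in the question.

```python
def hash_units(units):
    """Apply the 31*h + unit recurrence and wrap to a signed 32-bit int."""
    h = 0
    for u in units:
        h = (31 * h + u) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


text = "\u8005"             # the character 者, a single UTF-16 code unit
raw = text.encode("utf-8")  # three bytes: e8 80 85

as_bytes = hash_units(raw)                  # one step per byte
as_text = hash_units(ord(c) for c in text)  # one step per character (BMP only)

print(as_bytes)  # 227053
print(as_text)   # 32773
```

Decoding the input to text before hashing, as the function above does with s.decode('utf8'), makes both sides iterate over the same units.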