
Generate ID from string in Python

Tags:

python

hash

I'm struggling a bit to generate an integer ID for a given string in Python.

I thought the built-in hash function would be perfect, but it appears that the IDs are sometimes too long. That's a problem, since I'm limited to 64 bits as the maximum length.

My code so far: hash(s) % 10000000000. The input strings I can expect will be 12-512 characters long.

Requirements are:

  • integers only
  • generated from provided string
  • ideally up to 10-12 chars long (I'll have ~5 million items only)
  • low probability of collision..?

I would be glad if someone can provide any tips / solutions.

mlen108 asked Apr 09 '14 21:04



1 Answer

I would do something like this:

>>> import hashlib
>>> m = hashlib.md5()
>>> m.update("some string".encode("utf-8"))  # update() requires bytes in Python 3
>>> str(int(m.hexdigest(), 16))[0:12]
'120665287271'

The idea:

  1. Calculate the hash of the string with MD5 (or SHA-1, etc.) in hexadecimal form (see the hashlib module).
  2. Convert the hexadecimal string to an integer, then convert that integer back to a string in base 10 (so the result contains only digits).
  3. Use the first 12 characters of that string.
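
As a rough sketch, the three steps above could be wrapped in a function like the one below. The name string_to_id and the digits parameter are my own and not part of hashlib, and the UTF-8 encoding is an assumption about the input. Unlike the built-in hash(), whose results for strings change between interpreter runs in Python 3 unless PYTHONHASHSEED is set, a hashlib digest is the same every time.

```python
import hashlib


def string_to_id(s, digits=12):
    """Return an integer ID built from the leading decimal digits of MD5(s).

    `string_to_id` and `digits` are illustrative names, not a library API.
    """
    # Step 1: hashlib works on bytes, so encode the text first (UTF-8 assumed).
    hex_digest = hashlib.md5(s.encode("utf-8")).hexdigest()
    # Step 2: hexadecimal string -> integer -> base-10 string.
    decimal = str(int(hex_digest, 16))
    # Step 3: keep the leading digits and return them as an int.
    return int(decimal[:digits])
```

Calling string_to_id(s) returns an int rather than a string, which matches the "integers only" requirement in the question.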

If characters a-f are also okay, I would do m.hexdigest()[0:12].
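
Whether 12 digits is enough for ~5 million items can be estimated with the usual birthday approximation: with n items spread uniformly over N possible IDs, about n(n-1)/(2N) pairs are expected to collide. The sketch below is an estimate under that uniformity assumption (a decimal prefix is not perfectly uniform, so it is optimistic), and all helper names are hypothetical. Since the question allows up to 64 bits, it also shows a variant, which is not the method above, that keeps the first 8 bytes of the digest instead of 12 decimal digits. It is shifted down to 63 bits on the assumption that the ID has to fit a signed 64-bit field.

```python
import hashlib
import math


def expected_collisions(n_items, id_space):
    """Approximate expected number of colliding pairs (birthday approximation)."""
    return n_items * (n_items - 1) / (2 * id_space)


def collision_probability(n_items, id_space):
    """Approximate probability of at least one collision, assuming uniform IDs."""
    # 1 - exp(-x), written with expm1 so tiny probabilities keep their precision.
    return -math.expm1(-expected_collisions(n_items, id_space))


def string_to_id64(s):
    """Hypothetical variant: a non-negative 63-bit integer from the MD5 digest."""
    digest = hashlib.md5(s.encode("utf-8")).digest()  # UTF-8 is an assumption
    # Take the first 8 bytes as an unsigned integer, then drop one bit
    # so the value fits in a signed 64-bit field.
    return int.from_bytes(digest[:8], "big") >> 1


if __name__ == "__main__":
    n = 5_000_000
    print(expected_collisions(n, 10**12))   # 12-digit ID space
    print(collision_probability(n, 2**63))  # 63-bit ID space
```

Under these assumptions, 5 million items in a 10**12 space give roughly 12.5 expected colliding pairs, while in a 63-bit space the expected count is on the order of 1e-6, so using the full 64-bit budget looks like the safer choice if the shorter length is not a hard requirement.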

Stephan Kulla answered Oct 02 '22 16:10