
How can I generate a long hash of a String?

I have a Java application in which I want to generate long ids for strings (in order to store those strings in Neo4j). To avoid data duplication, I would like to generate an id for each string, stored in a long integer, which should be unique for each string. How can I do that?

Riduidel asked Feb 16 '12 10:02



4 Answers

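This code will calculate a pretty good hash (the upper 64 bits of a name-based, MD5-derived UUID). The charset is given explicitly so that the result does not depend on the platform default:

import java.nio.charset.StandardCharsets;
import java.util.UUID;
String s = "some string";
long hash = UUID.nameUUIDFromBytes(s.getBytes(StandardCharsets.UTF_8)).getMostSignificantBits();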
Ran answered Oct 22 '22 03:10


Why don't you have a look at the hashCode() method of String and just adapt it to use long values instead?

By the way, if there were a way to create a unique 64-bit ID for every String, you would have found a compression algorithm able to pack every String into 8 bytes, which is impossible by definition.
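A minimal sketch of that adaptation, assuming the same multiplier (31) that String.hashCode() uses; the class and method names are made up for illustration. It also shows that widening to long does not make the hash unique: "Aa" and "BB" still collide.

```java
final class LongStringHash {
    // Same polynomial as String.hashCode(), but accumulated in a long
    // (arithmetic wraps modulo 2^64 instead of 2^32).
    static long hash64(String s) {
        long h = 0L;
        for (int i = 0; i < s.length(); i++) {
            h = 31L * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(hash64("some string"));
        // Both strings hash to 2112, so this prints "true".
        System.out.println(hash64("Aa") == hash64("BB"));
    }
}
```

The low 32 bits of this value always equal String.hashCode(), so short strings gain nothing from the wider type; a larger multiplier or a different mixing function would likely spread values better.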

Daniel answered Oct 22 '22 04:10


A long has 64 bits. A String of 9 one-byte characters has 72 bits (and Java chars are 16 bits each, so even 5 of them exceed 64 bits). By the pigeonhole principle, you cannot map every 9-character string to a distinct long.

If you still want a long hash, you can take two different standard String->int hash functions, hash1() and hash2(), and calculate: hash(s) = 2^32 * hash1(s) + hash2(s), treating hash2(s) as an unsigned 32-bit value so that it does not disturb the upper half.
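A hedged sketch of that combination. The choice of functions is an assumption, not part of the answer: hash1 is String.hashCode() and hash2 is a 32-bit FNV-1a over the UTF-8 bytes; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class CombinedHash {
    // hash2: 32-bit FNV-1a over the UTF-8 bytes (an arbitrary pick for this sketch).
    static int fnv1a32(String s) {
        int h = 0x811C9DC5;                 // FNV offset basis
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h *= 0x01000193;                // FNV prime
        }
        return h;
    }

    // hash(s) = 2^32 * hash1(s) + hash2(s), with hash2 taken as unsigned.
    static long hash64(String s) {
        long high = (long) s.hashCode() << 32;
        long low = fnv1a32(s) & 0xFFFFFFFFL; // mask avoids sign extension
        return high | low;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" share a hashCode(), but the second hash tells them apart.
        System.out.println(Long.toHexString(hash64("Aa")));
        System.out.println(Long.toHexString(hash64("BB")));
    }
}
```

Collisions remain possible whenever both component hashes collide at once; the combination only makes that less likely.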

amit answered Oct 22 '22 03:10


There are many answers, try the following:

  • http://stackoverflow.com/questions/415953/generate-md5-hash-in-java EDIT: removed, I've missed the long requirement. Mea culpa.
  • http://en.wikipedia.org/wiki/Perfect_hash_function

Or, as suggested before, check out the sources.

PS. One more technique is to maintain a dictionary of strings: since you're unlikely to get 2^64 strings any time soon, you can have a perfect mapping. Note, though, that this mapping may well become a major bottleneck.
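A minimal in-memory sketch of such a dictionary, assuming ids are simply handed out sequentially; the class name is hypothetical, and a real setup would also need to persist the mapping somewhere.

```java
import java.util.HashMap;
import java.util.Map;

final class StringIdRegistry {
    private final Map<String, Long> ids = new HashMap<>();
    private long nextId = 0L;

    // Returns the id already assigned to s, or assigns the next free one.
    // The single lock is exactly the kind of bottleneck mentioned above.
    synchronized long idFor(String s) {
        Long existing = ids.get(s);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        ids.put(s, id);
        return id;
    }

    public static void main(String[] args) {
        StringIdRegistry registry = new StringIdRegistry();
        System.out.println(registry.idFor("foo")); // first string seen
        System.out.println(registry.idFor("bar")); // second string seen
        System.out.println(registry.idFor("foo")); // same id as the first call
    }
}
```

Unlike a hash, this mapping never collides, but the id of a string depends on insertion order rather than on its content.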

alf answered Oct 22 '22 02:10