
Is the int value of String.hashCode() unique?

I ran into a problem a few days ago. I have tens of millions of words, all strings, and I have decided to store them in a database and use an index to keep them unique. I do not want to compare the original words to enforce uniqueness. Is the value returned by a string's hashCode() method unique? And will it stay the same if I use another laptop, run the program at a different time, or something like that?

congsg2014 asked Sep 09 '14 03:09



2 Answers

Unique, no. By nature, hash values are not guaranteed to be unique.

Any system with an arbitrarily large number of possible inputs and a limited number of outputs will have collisions.

So, you won't be able to use a unique database key to store them if it's based only on the hash code. You can, however, use a non-unique key to store them.
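To illustrate the non-unique key idea, here is a minimal in-memory sketch. It is only a stand-in for a database table with a non-unique index on the hash column: the class name HashBucketStore and its methods are hypothetical, and a real schema would look different. The point is that the hash narrows the search to a handful of candidates, and only those candidates are compared against the original word.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for "non-unique index on hash + equality check".
class HashBucketStore {
    // Each hash code maps to every distinct word that produced it.
    private final Map<Integer, List<String>> buckets = new HashMap<>();

    // Returns true if the word was new, false if it was already stored.
    boolean add(String word) {
        List<String> bucket =
                buckets.computeIfAbsent(word.hashCode(), k -> new ArrayList<>());
        // Only words sharing this hash code are compared by content.
        if (bucket.contains(word)) {
            return false;
        }
        bucket.add(word);
        return true;
    }

    // Total number of distinct words stored.
    int size() {
        int total = 0;
        for (List<String> bucket : buckets.values()) {
            total += bucket.size();
        }
        return total;
    }

    public static void main(String[] args) {
        HashBucketStore store = new HashBucketStore();
        System.out.println(store.add("FB")); // true: new word
        System.out.println(store.add("Ea")); // true: same hash as "FB", different word
        System.out.println(store.add("FB")); // false: duplicate
        System.out.println(store.size());    // 2
    }
}
```

With this layout the original words are still compared, but only within one bucket, which for typical data holds very few entries.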

In reply to your second question about whether different versions of Java will generate different hash codes for the same string, no.

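Provided a Java implementation follows the Oracle documentation (otherwise it's not really a Java implementation), the result will be consistent across all implementations. The Oracle docs for String.hashCode specify a fixed formula for calculating the hash, using int arithmetic, where s[i] is the i-th character, n is the length of the string, and ^ denotes exponentiation: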

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

You may want to check this is still the case if you're using wildly disparate versions of Java (such as 1.2 vs 8) but it's been like that for a long time, at least since 1.5.
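If you want to check this on your own JVM, here is a small sketch that evaluates the documented formula by hand and compares it with the built-in method. The class name HashFormulaDemo and the helper manualHash are made up for this example; the loop is just Horner's rule applied to the formula above, relying on ordinary int overflow.

```java
// Hypothetical helper class for checking the documented formula.
class HashFormulaDemo {
    // Evaluates s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] in int arithmetic.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            // Multiplying the running total by 31 raises every earlier term by one power.
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String[] samples = {"", "a", "hello", "FB", "Ea"};
        for (String s : samples) {
            // The two values should match for every sample.
            System.out.println("\"" + s + "\": manual=" + manualHash(s)
                    + " builtin=" + s.hashCode());
        }
    }
}
```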

paxdiablo answered Nov 15 '22 06:11


No.

A string in Java can have at most 2,147,483,647 (2^31 - 1) characters, and every character can vary, so the number of possible strings is enormous. An int, however, only ranges from -2,147,483,648 to 2,147,483,647. With far more possible strings than int values, uniqueness is impossible. The hash code of a string is computed using this formula:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1].

Example :

If you create two strings, "FB" and "Ea", their hash codes will be the same.
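A quick way to see this for yourself (the class name CollisionDemo and the helper collides are just names picked for this sketch):

```java
// Hypothetical demo class showing two different strings with one hash code.
class CollisionDemo {
    // True when a and b are different strings that share a hash code.
    static boolean collides(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'F' = 70, 'B' = 66: 70*31 + 66 = 2236
        System.out.println("FB".hashCode()); // 2236
        // 'E' = 69, 'a' = 97: 69*31 + 97 = 2236
        System.out.println("Ea".hashCode()); // 2236
        System.out.println(collides("FB", "Ea")); // true
    }
}
```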

Abhishek Gharai answered Nov 15 '22 07:11