I understand that this has already been discussed and the answer is yes: String.hashCode
can generate equal values for different strings, although it is quite unlikely (see "Can Java's hashCode produce same value for different strings?"). However, it does happen in my application.
The following code produces the same hash code for both strings, -347019262 (Java 1.7.25):
String string1 = "/m/06qw_";
String string2 = "/m/0859_";
System.out.println(string1 + "," + string1.hashCode());
System.out.println(string2 + "," + string2.hashCode());
I do need a hash code in this case, because I want to use it to generate a unique primary key for a string. It seems that I am not doing it right. Any suggestions, please?
Many thanks!
If two string objects are equal, the GetHashCode method returns identical values. However, there is not a unique hash code value for each unique string value. Different strings can return the same hash code. The hash code itself is not guaranteed to be stable.
Getting the hash code of a string is simple in C#: call the GetHashCode() method. A hash code is a numeric value computed from the string's contents; it is not a unique identifier. Strings that have the same value have the same hash code.
Simply put, hashCode() returns an integer value, generated by a hashing algorithm. Objects that are equal (according to their equals()) must return the same hash code. Different objects do not need to return different hash codes.
hashCode() in Java helps hash-based collections run faster. Hash data structures such as HashMap organize their elements internally in an array of buckets and use the hash code to pick a bucket, so a lookup only has to call equals() on the few entries in that bucket. Comparing two int hash codes is also cheaper than a full equals() comparison, but equal hash codes do not prove that two objects are equal.
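To make the bucket behaviour concrete, here is a minimal sketch that uses the two colliding strings from the question as HashMap keys. The class name CollidingKeysDemo and the integer values are made up for illustration; only String.hashCode(), equals() and HashMap come from the JDK.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeysDemo {
    // Stores both colliding strings from the question as separate map keys.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> ids = new HashMap<>();
        ids.put("/m/06qw_", 1);
        ids.put("/m/0859_", 2);
        return ids;
    }

    public static void main(String[] args) {
        String a = "/m/06qw_";
        String b = "/m/0859_";

        // Same hash code, so both keys land in the same bucket...
        System.out.println(a.hashCode() == b.hashCode()); // true
        // ...but equals() still tells them apart.
        System.out.println(a.equals(b)); // false

        Map<String, Integer> ids = buildMap();
        System.out.println(ids.size() + " " + ids.get(a) + " " + ids.get(b)); // 2 1 2
    }
}
```

The map keeps both entries because it falls back to equals() inside a bucket, so a collision costs a little speed but never correctness.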
You misunderstand .hashCode().
One part of the contract is that objects that are equal according to equals() must have the same hashCode(). However, the reverse is not true: two objects that have the same hashCode() do not have to be equal.
This is a valid, albeit perfectly useless, hashCode() implementation:
@Override
public int hashCode()
{
    return 42; // universal answer
}
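As a rough illustration of why that implementation is valid, the sketch below puts objects with a constant hash code into a HashSet. The Key class and the distinctCount helper are hypothetical names invented for this example, not part of any library.

```java
import java.util.HashSet;
import java.util.Set;

class ConstantHashDemo {
    // Hypothetical key type whose hashCode() is the constant from the answer.
    static final class Key {
        private final String value;

        Key(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).value.equals(value);
        }

        @Override
        public int hashCode() {
            return 42; // legal: equal objects still get equal hash codes
        }
    }

    // Counts how many distinct keys survive in a HashSet.
    static int distinctCount(String... values) {
        Set<Key> keys = new HashSet<>();
        for (String v : values) {
            keys.add(new Key(v));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount("a", "b", "a")); // 2
    }
}
```

Every key ends up in one bucket, so the set degrades towards a linear scan, but duplicates are still detected correctly because equals() has the final say.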
You should use the string itself as the "primary key". If you want a "more efficient" key, consider what format the input string has and, if possible, extract a significant part of it.
The sensible option is to use the string as the primary key. (Another choice would be to associate a GUID with your data record and have that as the primary key.)
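One way this could look in Java, as a sketch only: the string is the lookup key, and a GUID (java.util.UUID here) is attached as a surrogate key the first time a string is seen. The StringKeyRegistry class and its idFor method are hypothetical names, and an in-memory map stands in for whatever storage the application really uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

class StringKeyRegistry {
    // The string itself is the map key; the UUID is an assumed surrogate key.
    private final Map<String, UUID> ids = new LinkedHashMap<>();

    // Returns the existing id for s, or assigns a fresh random one.
    UUID idFor(String s) {
        return ids.computeIfAbsent(s, k -> UUID.randomUUID());
    }

    int size() {
        return ids.size();
    }

    public static void main(String[] args) {
        StringKeyRegistry registry = new StringKeyRegistry();
        UUID first = registry.idFor("/m/06qw_");
        UUID second = registry.idFor("/m/0859_");

        // The colliding strings from the question get separate ids.
        System.out.println(first.equals(second)); // false
        // Asking again for the same string returns the same id.
        System.out.println(first.equals(registry.idFor("/m/06qw_"))); // true
    }
}
```

Uniqueness here comes from comparing the full strings, not from their hash codes, so the two strings from the question no longer clash.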
Hashing is meant to be (1) fast and (2) such that two equal strings have the same hash code.
I'd submit that you are likely to get hash collisions; after all, an int (the hash return type) has only about 4 billion distinct values.
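To put a rough number on "likely", the sketch below applies the standard birthday approximation, p ≈ 1 - exp(-n(n-1) / (2 · 2^32)), to a 32-bit hash. It assumes the hash values are uniformly distributed, which String.hashCode() does not guarantee, so treat the result as an optimistic estimate. The class and method names are made up for this example.

```java
class BirthdayEstimate {
    // Approximate chance of at least one collision among n keys,
    // ASSUMING uniformly distributed 32-bit hash values.
    static double collisionProbability(long n) {
        double buckets = Math.pow(2, 32);          // number of distinct int values
        double pairs = (double) n * (n - 1) / 2.0; // number of key pairs
        return 1.0 - Math.exp(-pairs / buckets);
    }

    public static void main(String[] args) {
        long[] sizes = {10_000L, 77_163L, 1_000_000L};
        for (long n : sizes) {
            System.out.printf("%,d keys -> %.4f%n", n, collisionProbability(n));
        }
    }
}
```

Under that assumption, roughly 77,000 keys are already enough for about a 50% chance of at least one collision, and with a million keys a collision is practically certain.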