What is the best way to calculate a hash code based on the values of these strings in one pass?
By good I mean that it needs to be:
1 - fast: I need to get the hash code for a huge list (10^3..10^8 items) of short strings.
2 - identifying for the whole list of data, so that lists which differ in maybe only a couple of strings must have different hash codes.
How can I do this in Java?
Maybe there is a way to use the existing String hash code, but how do I merge the many hash codes calculated for the separate strings?
Thank you.
Create a wrapper class for your strings and then use the CRC32 class. It's simple and fast:
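import java.nio.charset.StandardCharsets;
import java.util.Collection;
import java.util.zip.CRC32;

public class HugeStringCollection {
    private final Collection<String> strings;

    public HugeStringCollection(Collection<String> strings) {
        this.strings = strings;
    }

    @Override
    public int hashCode() {
        CRC32 crc = new CRC32();
        // Feed every string into one running checksum, in iteration order.
        for (String string : strings) {
            // Fixed charset so the result does not depend on the platform default.
            crc.update(string.getBytes(StandardCharsets.UTF_8));
        }
        // getValue() returns the 32-bit CRC in a long; the cast keeps those 32 bits.
        return (int) crc.getValue();
    }
}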
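As for the question's idea of merging the hash codes that each String already has: one common option is the polynomial fold that `java.util.List.hashCode()` is documented to use. The sketch below is only an illustration; the class name `CombinedStringHash` and the method `combine` are made up for this example. It stays a single pass and is order-sensitive, but the result is only 32 bits, so it cannot guarantee that two different lists get different values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original answer.
class CombinedStringHash {

    // Folds each string's own hashCode() into a running value using
    // h = 31 * h + elementHash, starting from 1 (the List.hashCode() scheme).
    static int combine(Iterable<String> strings) {
        int h = 1;
        for (String s : strings) {
            h = 31 * h + (s == null ? 0 : s.hashCode());
        }
        return h;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("ab", "c");
        List<String> b = Arrays.asList("a", "bc");
        System.out.println(combine(a) == a.hashCode()); // prints "true"
        System.out.println(combine(a) == combine(b));   // prints "false"
    }
}
```

Because the fold works on whole elements, the lists ["ab", "c"] and ["a", "bc"] get different values here, although they contain the same characters in the same order.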
If the collection itself is immutable, you can compute the hash once and store it for later reuse.
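A minimal sketch of that caching idea, under a few assumptions: the class name `CachedStringListHash` is hypothetical, the caller never modifies the list after handing it over, no element is null, and the object is used from a single thread. It also feeds each string's byte length into the checksum before the bytes themselves. This is an addition to the answer above, made because feeding only the raw bytes gives ["ab", "c"] and ["a", "bc"] the same byte stream and therefore the same CRC.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical wrapper; assumes the wrapped list is never mutated.
final class CachedStringListHash {
    private final List<String> strings;
    private int hash;              // cached result, valid once hashComputed is true
    private boolean hashComputed;  // not thread-safe; single-threaded use assumed

    CachedStringListHash(List<String> strings) {
        this.strings = strings;
    }

    @Override
    public int hashCode() {
        if (!hashComputed) {
            CRC32 crc = new CRC32();
            for (String s : strings) {
                byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
                // Length prefix (4 bytes, big-endian) marks the string boundary.
                // CRC32.update(int) uses only the low eight bits of its argument.
                int n = bytes.length;
                crc.update(n >>> 24);
                crc.update(n >>> 16);
                crc.update(n >>> 8);
                crc.update(n);
                crc.update(bytes);
            }
            hash = (int) crc.getValue();
            hashComputed = true;
        }
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        // Kept consistent with hashCode(): equal lists give equal hashes.
        return o instanceof CachedStringListHash
                && strings.equals(((CachedStringListHash) o).strings);
    }
}
```

The first call to `hashCode()` walks the whole list; later calls return the stored value without touching the strings again. A 32-bit checksum still cannot rule out collisions between different lists, so `equals` remains the final arbiter.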