Say I want to store a dictionary of strings and I want to know if some string exists or not. I can use a Trie or a HashMap. The HashMap has a time complexity of O(1) with a high probability while the Trie in that case would have a time complexity of O(k) where k is the length of the string.
Now my question is: Doesn't calculating the hash value of the string have a time complexity of O(k) thus making the complexity of the HashMap the same? If not, why?
The way I see it is that a Trie here would have lower time complexity than a HashMap for looking up a string since the HashMap -in addition to calculating the hash value- might hit collisions. Am I missing something?
Update: Which data structure would you use to optimize for speed when constructing a dictionary?
Apart from the complexity of implementation of a trie, certain optimizations are done in the implementation of the hashCode
method that determines the buckets in a hash table. For java.lang.String
, an immutable class, here is what JDK-8 does:
public int hashCode() {
int h = hash;
if (h == 0 && value.length > 0) {
char val[] = value;
for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
hash = h;
}
return h;
}
Thus, it is cached (and is thread-safe). Once calculated, the hash code of a string need not be recalculated. This saves you from having to spend the O(k)
time in the case of hash table (or hash set, hash map).
While implementing dictionaries, I think tries shine where you are more interested in possible partial matches rather than exact matches. Generally speaking hash based solutions work best in case of exact matches.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With