I'm trying to count the number of times each word appears in a text. I'm using a HashMap,
and my implementation ignores case. I achieve that by converting all words to lowercase:
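    for line in reader.lines() {
        // `curr` is never reassigned, so it does not need to be `mut`.
        for curr in line.as_ref().unwrap().split_whitespace() {
            // `to_lowercase` already returns a new String, so no `to_string` is needed.
            match word_map.entry(curr.to_lowercase()) {
                Entry::Occupied(entry) => {
                    *entry.into_mut() += 1;
                }
                Entry::Vacant(entry) => {
                    entry.insert(1);
                }
            }
        }
    }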
I want to treat "the" and "The" as the same word, but if "the" never appears, the HashMap should just hold "The".
Right now, I hold all words in lowercase. Is there an efficient way to do this?
The easiest way to do it is to use UniCase from the unicase crate as a key:
    use unicase::UniCase;
    type Words = std::collections::HashMap<UniCase<String>, u32>;
If I understand their documentation correctly, UniCase::new("The") stores the actual string "The", but if you compare it with UniCase::new("the"), the two compare as equal.
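To illustrate the idea behind such a key type without the extra dependency, here is a minimal std-only sketch. It is not how unicase is implemented; `CaseInsensitive` and `count_words` are hypothetical names made up for this example. The wrapper keeps the original spelling but compares and hashes by lowercased characters, so the map holds whichever spelling it saw first.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical key type: stores the original spelling, but equality and
/// hashing only look at the lowercased characters.
#[derive(Debug, Clone)]
pub struct CaseInsensitive(pub String);

impl PartialEq for CaseInsensitive {
    fn eq(&self, other: &Self) -> bool {
        // Compare lowercased characters lazily, without allocating new Strings.
        self.0
            .chars()
            .flat_map(char::to_lowercase)
            .eq(other.0.chars().flat_map(char::to_lowercase))
    }
}

impl Eq for CaseInsensitive {}

impl Hash for CaseInsensitive {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Must agree with `eq`: keys that compare equal must hash equally.
        for c in self.0.chars().flat_map(char::to_lowercase) {
            c.hash(state);
        }
    }
}

/// Counts words ignoring case. When a word is already present, the map
/// keeps the key it already has, so the first spelling seen is retained.
pub fn count_words(text: &str) -> HashMap<CaseInsensitive, u32> {
    let mut word_map = HashMap::new();
    for word in text.split_whitespace() {
        *word_map
            .entry(CaseInsensitive(word.to_string()))
            .or_insert(0) += 1;
    }
    word_map
}
```

With this sketch, counting "The the THE" gives a single entry whose stored key is "The". If you specifically want the lowercase spelling to win whenever it appears later, you would have to replace the stored key yourself; this example does not do that.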