Is there any efficient way to have a case insensitive string as a HashMap key?

I'm trying to count the number of times each word appears in a text. I'm using a HashMap, and my implementation ignores case. I achieve that by converting all words to lowercase:

for line in reader.lines() {
    for curr in line.as_ref().unwrap().split_whitespace() {
        match word_map.entry(curr.to_lowercase()) {
            Entry::Occupied(entry) => {
                *entry.into_mut() += 1;
            }
            Entry::Vacant(entry) => {
                entry.insert(1);
            }
        }
    }
}

I want to treat "the" and "The" as the same word, but if "the" never appears, the HashMap should just hold "The". Right now, I store all words in lowercase. Is there an efficient way to do this?

Mehmet Hakan Kurtoğlu asked Dec 14 '17 12:12

1 Answer

The easiest way is to use UniCase (from the unicase crate) as the key:

use unicase::UniCase;

type Words = std::collections::HashMap<UniCase<String>, u32>;

If I understand their documentation correctly, UniCase::new("The") stores the actual string "The", but if you compare it with UniCase::new("the"), the two compare as equal.
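To make the idea concrete, here is a minimal std-only sketch of a key type that keeps its original spelling but hashes and compares case-insensitively. It is not the unicase implementation: the `CaseInsensitive` wrapper and the `count_words` helper are hypothetical names introduced for illustration, and the sketch assumes ASCII-only case folding, whereas UniCase also handles Unicode.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper (not part of unicase): stores the original
/// spelling, but compares and hashes ignoring ASCII case.
#[derive(Debug, Clone)]
struct CaseInsensitive(String);

impl PartialEq for CaseInsensitive {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for CaseInsensitive {}

impl Hash for CaseInsensitive {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the ASCII-lowercased bytes so that keys which compare
        // equal also produce the same hash, as HashMap requires.
        for byte in self.0.bytes() {
            state.write_u8(byte.to_ascii_lowercase());
        }
    }
}

/// Counts words ignoring ASCII case. Each key keeps the spelling of
/// the first occurrence that was inserted.
fn count_words(text: &str) -> HashMap<CaseInsensitive, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(CaseInsensitive(word.to_string())).or_insert(0) += 1;
    }
    counts
}
```

With this sketch, `count_words("The the THE cat")` yields two entries, and the entry for "the" keeps the key "The" because that spelling was inserted first. If the lowercase spelling should win whenever it appears, the key would have to be replaced on a later insert, which this sketch does not do.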

Boiethios answered Oct 14 '22 19:10
