Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Are owned and borrowed strings guaranteed to hash to the same value?

Tags:

hashmap

hash

rust

String and str both implement Hash, so we can hash either of them. It appears that both owned and borrowed strings currently hash to the same value, so this assertion succeeds:

use std::hash::Hash;
use std::hash::Hasher;
use std::collections::hash_map::DefaultHasher;
pub fn main() {
    let hash1 = {
        let x: String = "abc".to_owned();
        let mut hasher = DefaultHasher::new();
        x.hash(&mut hasher);
        hasher.finish()
    };
    let hash2 = {
        let x: &str = "abc";
        let mut hasher = DefaultHasher::new();
        x.hash(&mut hasher);
        hasher.finish()
    };
    assert!(hash1 == hash2);
}

I'm writing code that leverages on this behaviour in the raw_entry API of HashMap. Specifically, I'm using a HashMap where the keys are enums, but to reduce redundant allocations I want to do the lookup using "borrowed" versions of those enums.

In other words, in the following code I do need the guarantee that both assertions will succeed, regardless of the Hasher implementation being used. It appears to me that this would depend on the guarantees provided by the Hash implementation of String and str.

use std::hash::Hash;
use std::hash::Hasher;
use std::collections::hash_map::DefaultHasher;
pub fn main() {
    {
        #[derive(Hash)]
        enum E1 {
            First(i32),
            Second(String),
        }
        #[derive(Hash)]
        enum E2<'a> {
            First(i32),
            Second(&'a str),
        }
        let hash1 = {
            let x: E1 = E1::First(100);
            let mut hasher = DefaultHasher::new();
            x.hash(&mut hasher);
            hasher.finish()
        };
        let hash2 = {
            let x: E2 = E2::First(100);
            let mut hasher = DefaultHasher::new();
            x.hash(&mut hasher);
            hasher.finish()
        };
        assert!(hash1 == hash2);
        let hash3 = {
            let x: E1 = E1::Second("abc".to_owned());
            let mut hasher = DefaultHasher::new();
            x.hash(&mut hasher);
            hasher.finish()
        };
        let hash4 = {
            let x: E2 = E2::Second("abc");
            let mut hasher = DefaultHasher::new();
            x.hash(&mut hasher);
            hasher.finish()
        };
        assert!(hash3 == hash4);
    }
}

Is it documented anywhere about such guarantees? I would suppose that such guarantees must be provided (otherwise I see no way to properly implement the contains_key() method of HashMap, since the argument can be any borrowed form of the key), but I can't find this guarantee documented anywhere.

like image 396
Bernard Avatar asked Dec 22 '22 18:12

Bernard


1 Answers

Yes. This is guaranteed because String implements Borrow<str>.

Part of the contract for implementations of Borrow is:

Further, when providing implementations for additional traits, it needs to be considered whether they should behave identical to those of the underlying type as a consequence of acting as a representation of that underlying type. Generic code typically uses Borrow<T> when it relies on the identical behavior of these additional trait implementations. These traits will likely appear as additional trait bounds.

In particular Eq, Ord and Hash must be equivalent for borrowed and owned values: x.borrow() == y.borrow() should give the same result as x == y.

In the standard library, the Borrow trait is used in HashMap::get. Borrow makes it possible to pass a &str to get on a HashMap<String, _>. Naturally, in order for this to work, &String and &str must produce the same hash for the same string, hence the requirement on Borrow. The requirement is repeated in the documentation for HashMap::get:

The key may be any borrowed form of the map's key type, but Hash and Eq on the borrowed form must match those for the key type.


Traits cannot define requirements like this in code, so it is possible for non-conforming implementations to exist as the compiler cannot enforce these requirements. However, such implementations would break HashMap.

like image 182
Francis Gagné Avatar answered Dec 26 '22 02:12

Francis Gagné