
Are there any HashMap implementations with consistent ordering between program runs?

Tags: hashmap, rust

I've observed that HashMap iterates over its elements in a different order on each program start, even with the same data. It looks like HashMap uses some absolute addresses to sort elements. Is there another HashMap implementation that produces the same order whenever the same data is inserted?

Stepan Yakovenko asked Aug 26 '17 10:08



2 Answers

"I've observed that HashMap has a different order of elements even with the same data on the next program start."

You don't have to rely on observation; this is documented for HashMap:

"By default, HashMap uses a hashing algorithm selected to provide resistance against HashDoS attacks. The algorithm is randomly seeded, and a reasonable best-effort is made to generate this seed from a high quality, secure source of randomness provided by the host without blocking the program."

It's worth noting that this means that two HashMaps with the same set of inserted values in the same program run will likely have different ordering:

use std::collections::HashMap;

fn main() {
    let a = (0..100).zip(100..200);

    let hash_one: HashMap<_, _> = a.clone().collect();
    let hash_two: HashMap<_, _> = a.clone().collect();

    // prints "false", most of the time
    println!("{}", hash_one.into_iter().eq(hash_two));
}

The documentation also tells you how to address the problem:

"The hashing algorithm can be replaced on a per-HashMap basis using the default, with_hasher, and with_capacity_and_hasher methods. Many alternative algorithms are available on crates.io, such as the fnv crate."

Since I worked on twox-hash, I'll show that as an example:

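use std::collections::HashMap;
use std::hash::BuildHasherDefault;
use twox_hash::XxHash64;

fn main() {
    // BuildHasherDefault creates the hasher through Default, so its seed is fixed, not random.
    // Recent twox-hash releases name the 64-bit hasher XxHash64 (older ones exported it as XxHash).
    let mut hash: HashMap<_, _, BuildHasherDefault<XxHash64>> = Default::default();
    hash.insert(42, "the answer");
    assert_eq!(hash.get(&42), Some(&"the answer"));
}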
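The same idea should also work without an external crate. As a minimal sketch, assuming you are willing to give up the HashDoS resistance that the random seed provides: wrapping the standard library's DefaultHasher in BuildHasherDefault yields a hasher that starts from the same state every time. The FixedSeedMap alias and the iteration_order function are names made up for this illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

// Hypothetical alias: a HashMap whose hasher is created via Default,
// i.e. without the per-map random seed that RandomState uses.
type FixedSeedMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;

// Builds a map from the same data and returns its keys in iteration order.
fn iteration_order() -> Vec<i32> {
    let map: FixedSeedMap<i32, i32> = (0..100).zip(100..200).collect();
    map.keys().copied().collect()
}

fn main() {
    // Both maps hash identically and see the same insertion sequence,
    // so they iterate in the same order.
    // prints "true"
    println!("{}", iteration_order() == iteration_order());
}
```

The order is still arbitrary, and nothing promises it stays the same across Rust versions or platforms, so this is only suitable when "same binary, same input, same order" is all you need.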

That being said, relying on the order of a HashMap sounds like a bad idea. Perhaps you should use a different data structure, such as a BTreeMap.
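As a small illustration of the BTreeMap suggestion (the sorted_keys function and its sample data are made up for this sketch): a BTreeMap iterates in ascending key order, so the result depends only on the keys, not on hashing or insertion order.

```rust
use std::collections::BTreeMap;

// Inserts keys out of order and returns them in the map's iteration order.
fn sorted_keys() -> Vec<&'static str> {
    let mut map = BTreeMap::new();
    map.insert("pear", 3);
    map.insert("apple", 1);
    map.insert("mango", 2);
    // Iteration over a BTreeMap is always sorted by key.
    map.keys().copied().collect()
}

fn main() {
    // prints ["apple", "mango", "pear"]
    println!("{:?}", sorted_keys());
}
```

The trade-off is that keys must implement Ord, and lookups are O(log n) rather than the expected O(1) of a hash map.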

In other cases, you actually care about the order of insertion. For that, the indexmap crate is appropriate.
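To show what "order of insertion" means here, the following is a rough std-only sketch of the idea, not the indexmap API: entries live in a Vec in insertion order, and a HashMap maps each key to its position. InsertionOrderedMap and insertion_order_demo are hypothetical names; in real code you would most likely reach for the crate instead.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical type for illustration only; not the indexmap API.
struct InsertionOrderedMap<K, V> {
    entries: Vec<(K, V)>,     // entries in insertion order
    index: HashMap<K, usize>, // key -> position in `entries`
}

impl<K: Hash + Eq + Clone, V> InsertionOrderedMap<K, V> {
    fn new() -> Self {
        Self { entries: Vec::new(), index: HashMap::new() }
    }

    // Inserts a new entry, or updates an existing one in place
    // so that it keeps its original position.
    fn insert(&mut self, key: K, value: V) {
        if let Some(&i) = self.index.get(&key) {
            self.entries[i].1 = value;
        } else {
            self.index.insert(key.clone(), self.entries.len());
            self.entries.push((key, value));
        }
    }

    fn get(&self, key: &K) -> Option<&V> {
        self.index.get(key).map(|&i| &self.entries[i].1)
    }

    // Iterates in insertion order; the inner HashMap's order is never exposed.
    fn iter(&self) -> impl Iterator<Item = (&K, &V)> {
        self.entries.iter().map(|(k, v)| (k, v))
    }
}

fn insertion_order_demo() -> Vec<(i32, &'static str)> {
    let mut map = InsertionOrderedMap::new();
    map.insert(3, "three");
    map.insert(1, "one");
    map.insert(2, "two");
    map.insert(3, "THREE"); // update: key 3 stays in first position
    map.iter().map(|(k, v)| (*k, *v)).collect()
}

fn main() {
    let mut map = InsertionOrderedMap::new();
    map.insert("a", 1);
    assert_eq!(map.get(&"a"), Some(&1));
    // prints [(3, "THREE"), (1, "one"), (2, "two")]
    println!("{:?}", insertion_order_demo());
}
```

Removal is left out on purpose: deleting from the middle of the Vec means either shifting entries or swapping in the last one, and that choice affects the resulting order.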

Shepmaster answered Oct 20 '22 00:10


I believe linked-hash-map is the de facto crate for this.

Jimmy answered Oct 20 '22 01:10