Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

max_by_key on Map doesn't allow destructuring of tuple into key-value pair

Tags:

rust

lifetime

I am learning Rust and fairly good with concept of ownership, borrow, and references. I have reached Ch.8 of the second edition of the Rust Book.

I'm implementing the mode function using map as given in an exercise. I wrote following implementation using Iterator::max_by_key:

use std::collections::HashMap;

fn main() {
    let vs = vec![0, 0, 1, 1, 3, 4, 5, 6, 3, 3, 3];

    let mut counts = HashMap::new();
    for num in vs {
        let count = counts.entry(num).or_insert(0);
        *count += 1;
    }

    // This works
    let u = counts.iter().max_by_key(|v| v.1);

    // This doesn't work
    let v = counts.iter().max_by_key(|(k, v)| v);
}

I get following compiler error

error[E0495]: cannot infer an appropriate lifetime for pattern due to conflicting requirements
  --> src/main.rs:16:43
   |
16 |     let v = counts.iter().max_by_key(|(k, v)| v);
   |                                           ^
   |
note: first, the lifetime cannot outlive the anonymous lifetime #2 defined on the body at 16:38...
  --> src/main.rs:16:38
   |
16 |     let v = counts.iter().max_by_key(|(k, v)| v);
   |                                      ^^^^^^^^^^
note: ...so that reference does not outlive borrowed content
  --> src/main.rs:16:43
   |
16 |     let v = counts.iter().max_by_key(|(k, v)| v);
   |                                           ^
note: but, the lifetime must be valid for the method call at 16:13...
  --> src/main.rs:16:13
   |
16 |     let v = counts.iter().max_by_key(|(k, v)| v);
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: ...so that a type/lifetime parameter is in scope here
  --> src/main.rs:16:13
   |
16 |     let v = counts.iter().max_by_key(|(k, v)| v);
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

What does this error mean and why this is not allowed?

Update 1: Match tuple as input to map solves my problem. If I were using a stable compiler I wouldn't have asked this question. Here I got unintended compile errors, so I'm not closing this as duplicate.

like image 368
Xolve Avatar asked Apr 24 '18 08:04

Xolve


2 Answers

The solution is either to add a single &:

counts.iter().max_by_key(|&(k, v)| v);
//                        ^

... or (on nightly) to add a single *:

counts.iter().max_by_key(|(k, v)| *v);
//                                ^

It follows a detailed explanation with instructions on how to find out yourself. If you don't have the time, there is a summary at the end.


So why does this work?

In order to find out, let's first analyze the type of x in this snippet (which is your first version, but I renamed v to x for clarity):

counts.iter().max_by_key(|x| x.1);

To check the type of x we basically have two possibilities: dig through the docs or let the compiler tell us. Let's dig through the docs first and then confirm that knowledge with the compiler.

So counts is a HashMap<{integer}, {integer}> where {integer} is just some kind of integer: the compiler still has to figure out which integer exactly. If there is no more specific information given (like in your example), the compiler defaults to i32 for integers. To make it easier for us, let's fix the integer types:

let mut counts: HashMap<i32, u32> = HashMap::new();

So now you write counts.iter() ... let's check what this does by looking in the docs:

pub fn iter(&self) -> Iter<K, V>

Now we can either click on Iter to get more information about that type or we can click the exclamation mark on the left:

enter image description here

Either way, we see this important impl:

impl<'a, K, V> Iterator for Iter<'a, K, V>
    type Item = (&'a K, &'a V);

This tells us that the return type of HashMap::iter() is an iterator that yields items of the type (&K, &V) (a 2-tuple of references). Here, K is the key type (i32) and V is the value type (u32) of the hash map. So our iterator yields elements of type (&i32, &u32).

Ok great! Now we need to check Iterator::max_by_key:

fn max_by_key<B, F>(self, f: F) -> Option<Self::Item> 
where
    B: Ord,
    F: FnMut(&Self::Item) -> B, 

It gets slightly complicated but don't worry! We see that the method takes (in addition to self) one argument f: F. This is the closure you pass in. The where clause tells us that F: FnMut(&Self::Item) meaning that F is a function thing that has one argument of the type &Self::Item.

But we already know what the Self::Item of our iterator is: (&i32, &u32). So &Self::Item (with the added reference) is &(&i32, &u32)! This is the type of the closures argument, and thus the type of x.

Let's check if our research was correct. You can easily instruct the compiler to tell you the type of a variable x by enforcing a type error. Let's do it by adding the expression x == (). Here we try to compare your variable to () which never works. And indeed we get the error:

14 |         x == ();
   |           ^^ can't compare `&(&i32, &u32)` with `()`

Success! We correctly found the type of x. So how does this help us?

In the second example, you wrote:

counts.iter().max_by_key(|(k, v)| v);

So you used pattern matching in the argument list of the closure. But one might think: wait, how can the compiler even match a pattern (k, v) to the type &(&i32, &u32)? There is a reference in the beginning that doesn't fit!

And this is exactly what happens on the stable compiler:

error[E0658]: non-reference pattern used to match a reference (see issue #42640)
  --> src/main.rs:18:39
   |
18 |     counts.iter().max_by_key(|(k, v)| v);
   |                               ^^^^^^ help: consider using a reference: `&(k, v)`

You can see that the pattern &(k, v) does fit to &(&i32, &u32) (with k = &i32 and v = &u32).

So talking about the stable compiler, your problem simply was that your pattern didn't fit to the expected type.

So what's up with the nightly error?

Recently, some ergonomic improvements landed in Rust (still nightly only) which can help reduce noisy code in common situations. This particular improvement was proposed in RFC 2005. Such a common situation is matching on a reference of a tuple and wanting to get references to the elements instead, like in this case where we match on the type &(bool, String):

match &(true, "hi".to_string()) {
    // ...
}

So without thinking about references one would probably use the pattern (b, s) (similar like you did with (k, v)). But this doesn't work (on stable) since the pattern doesn't fit (it's missing a reference).

So instead the pattern &(b, s) works -- at least kind of. Because while the pattern matches the type, now s has the type String and is thus trying to move out of the original tuple which is not allowed (since we only had a reference to it).

So what you write instead is: &(b, ref s). Now s has the type &String which is fine.

Since & and ref seems noisy to many people, Rust wants to make these situations easier. Skipping over some details, Rust basically automatically converts a pattern like (a, b) into &(ref a, ref b) when the pattern is used on a reference type. Again, this helps in a few situations, but also introduces a few unexpected references -- like in your example:

counts.iter().max_by_key(|(k, v)| v);

As we saw, the pattern (k, v) actually doesn't fit to the type, but Rust applies the rule and converts your pattern into &(ref k, ref v). Now the pattern matching works, but we have another problem:

Now v is a &&u32: a reference to a reference! (To see why this is the case, just carefully check all the types we discussed above.) But the inner reference is something that only lives as long as the iterator does, so we can't return it and yada yada lifetime problems. The easy solution is simply to remove the outer reference since we don't need it.

We achieve this by either making our pattern explicit (and make it work on stable):

counts.iter().max_by_key(|&(k, v)| v);

Now v is &i32 again (but the i32 value we are referencing lives as long as the hash map, so everything is fine). Or we could instead remove the outer reference by adding a *:

counts.iter().max_by_key(|(k, v)| *v);

This still uses the nightly ergonomic improvement, but removes the outer reference, so that *v is also &i32.

As you might notice, since i32 is Copy we can also add two *.

Summary

Well that was a deep dive into the problem. In short:

  • On stable, your pattern is incompatible with the type ((k, v) doesn't fit to &(&{integer}, &{integer}). So you can fix the problem by fixing your pattern.
  • On nightly (with the RFC 2005 match ergonomics), you were bitten by an additional reference layer introduced by the compiler. This leads to lifetime errors. Luckily you don't need this additional reference so you can simply remove it.
like image 153
Lukas Kalbertodt Avatar answered Nov 16 '22 04:11

Lukas Kalbertodt


In short, use a reference (playground)

let v = counts.iter().max_by_key(|&(_, v)| v);

In long, the first example works, because v is copyable, which means you will get a copy of v in your closure. Tuples are not copyable, which means the tuple will get moved out of the hashmap, which is not permitted, that's why you have to use a reference.

like image 36
hellow Avatar answered Nov 16 '22 02:11

hellow