<p>I am reading the section on closures in the second edition of the Rust book. At the end of this section, there is an exercise to extend the <code>Cacher</code> implementation given before. I gave it a try:</p> <pre class="prettyprint"><code>use std::clone::Clone; use std::cmp::Eq; use std::collections::HashMap; use std::hash::Hash; struct Cacher<T, K, V> where T: Fn(K) -> V, K: Eq + Hash + Clone, V: Clone, { calculation: T, values: HashMap<K, V>, } impl<T, K, V> Cacher<T, K, V> where T: Fn(K) -> V, K: Eq + Hash + Clone, V: Clone, { fn new(calculation: T) -> Cacher<T, K, V> { Cacher { calculation, values: HashMap::new(), } } fn value(&mut self, arg: K) -> V { match self.values.clone().get(&arg) { Some(v) => v.clone(), None => { self.values .insert(arg.clone(), (self.calculation)(arg.clone())); self.values.get(&arg).unwrap().clone() } } } } </code></pre> <p>After creating a version that finally works, I am really unhappy with it. What really bugs me is that <code>cacher.value(...)</code> has 5(!) calls to <code>clone()</code> in it. Is there a way to avoid this?</p>

<p>Your suspicion is correct, the code contains too many calls to <code>clone()</code>, defeating the very optimizations <code>Cacher</code> is designed to achieve.</p> <h3>Cloning the entire cache</h3> <p>The one to start with is the call to <code>self.values.clone()</code> - it creates a copy of the entire cache <em>on every single access</em>.</p> <h3>After non-lexical lifetimes</h3> <p>Remove this clone.</p> <h3>Before non-lexical lifetimes</h3> <p>As you likely discovered yourself, simply removing <code>.clone()</code> doesn't compile. This is because the borrow checker considers the map referenced for the entire duration of <code>match</code>. The shared reference returned by <code>HashMap::get</code> points to the item inside the map, which means that while it exists, it is forbidden to create another mutable reference to the same map, which is required by <code>HashMap::insert</code>. For the code to compile, you need to split up the match in order to force the shared reference to go out of scope before <code>insert</code> is invoked:</p> <pre class="prettyprint"><code>// avoids unnecessary clone of the whole map fn value(&mut self, arg: K) -> V { if let Some(v) = self.values.get(&arg).map(V::clone) { return v; } else { let v = (self.calculation)(arg.clone()); self.values.insert(arg, v.clone()); v } } </code></pre> <hr> <p>This is much better and probably "good enough" for most practical purposes. The hot path, where the value is already cached, now consists of only a single clone, and that one is actually necessary because the original value must remain in the hash map. (Also, note that cloning doesn't need to be expensive or imply deep copying - the stored value can be an <code>Rc<RealValue></code>, which buys object sharing for free. In that case, <code>clone()</code> will simply increment the reference count on the object.)</p> <h3>Clone on cache miss</h3> <p>In case of cache miss, the key must be cloned, because <code>calculation</code> is declared to consume it. A single cloning will be sufficient, though, so we can pass the original <code>arg</code> to <code>insert</code> without cloning it again. The key clone still feels unnecessary, though - a calculation function shouldn't require ownership of the key it is transforming. Removing this clone boils down to modifying the signature of the calculation function to take the key by reference. Changing the trait bounds of <code>T</code> to <code>T: Fn(&K) -> V</code> allows the following formulation of <code>value()</code>:</p> <pre class="prettyprint"><code>// avoids unnecessary clone of the key fn value(&mut self, arg: K) -> V { if let Some(v) = self.values.get(&arg).map(V::clone) { return v; } else { let v = (self.calculation)(&arg); self.values.insert(arg, v.clone()); v } } </code></pre> <h3>Avoiding double lookups</h3> <p>Now are left with exactly two calls to <code>clone()</code>, one in each code path. This is optimal, as far as value cloning is concerned, but the careful reader will still be nagged by one detail: in case of cache miss, the hash table lookup will effectively happen <em>twice</em> for the same key: once in the call to <code>HashMap::get</code>, and then once more in <code>HashMap::insert</code>. It would be nice if we could instead reuse the work done the first time and perform only one hash map lookup. This can be achieved by replacing <code>get()</code> and <code>insert()</code> with <code>entry()</code>:</p> <pre class="prettyprint"><code>// avoids the second lookup on cache miss fn value(&mut self, arg: K) -> V { match self.values.entry(arg) { Entry::Occupied(entry) => entry.into_mut(), Entry::Vacant(entry) => { let v = (self.calculation)(entry.key()); entry.insert(v) } }.clone() } </code></pre> <p>We've also taken the opportunity to move the <code>.clone()</code> call after the match.</p> <p>Runnable example in the playground.</p>

<p>I was solving the same exercise and ended with the following code:</p> <pre class="prettyprint"><code>use std::thread; use std::time::Duration; use std::collections::HashMap; use std::hash::Hash; use std::fmt::Display; struct Cacher<P, R, T> where T: Fn(&P) -> R, P: Eq + Hash + Clone, { calculation: T, values: HashMap<P, R>, } impl<P, R, T> Cacher<P, R, T> where T: Fn(&P) -> R, P: Eq + Hash + Clone, { fn new(calculation: T) -> Cacher<P, R, T> { Cacher { calculation, values: HashMap::new(), } } fn value<'a>(&'a mut self, key: P) -> &'a R { let calculation = &self.calculation; let key_copy = key.clone(); self.values .entry(key_copy) .or_insert_with(|| (calculation)(&key)) } } </code></pre> <p>It only makes a single copy of the key in the <code>value()</code> method. It does not copy the resulting value, but instead returns a reference with a lifetime specifier, which is equal to the lifetime of the enclosing <code>Cacher</code> instance (which is logical, I think, because values in the map will continue to exist until the <code>Cacher</code> itself is dropped).</p> <p>Here's a test program:</p> <pre class="prettyprint"><code>fn main() { let mut cacher1 = Cacher::new(|num: &u32| -> u32 { println!("calculating slowly..."); thread::sleep(Duration::from_secs(2)); *num }); calculate_and_print(10, &mut cacher1); calculate_and_print(20, &mut cacher1); calculate_and_print(10, &mut cacher1); let mut cacher2 = Cacher::new(|str: &&str| -> usize { println!("calculating slowly..."); thread::sleep(Duration::from_secs(2)); str.len() }); calculate_and_print("abc", &mut cacher2); calculate_and_print("defghi", &mut cacher2); calculate_and_print("abc", &mut cacher2); } fn calculate_and_print<P, R, T>(intensity: P, cacher: &mut Cacher<P, R, T>) where T: Fn(&P) -> R, P: Eq + Hash + Clone, R: Display, { println!("{}", cacher.value(intensity)); } </code></pre> <p>And its output:</p> <pre class="prettyprint lang-none prettyprint-override"><code>calculating slowly... 10 calculating slowly... 20 10 calculating slowly... 3 calculating slowly... 6 3 </code></pre>

How do I remove excessive `clone` calls from a struct that caches arbitrary results?

Tags:

rust

I am reading the section on closures in the second edition of the Rust book. At the end of this section, there is an exercise to extend the Cacher implementation given before. I gave it a try:

use std::clone::Clone;
use std::cmp::Eq;
use std::collections::HashMap;
use std::hash::Hash;

struct Cacher<T, K, V>
where
    T: Fn(K) -> V,
    K: Eq + Hash + Clone,
    V: Clone,
{
    calculation: T,
    values: HashMap<K, V>,
}

impl<T, K, V> Cacher<T, K, V>
where
    T: Fn(K) -> V,
    K: Eq + Hash + Clone,
    V: Clone,
{
    fn new(calculation: T) -> Cacher<T, K, V> {
        Cacher {
            calculation,
            values: HashMap::new(),
        }
    }

    fn value(&mut self, arg: K) -> V {
        match self.values.clone().get(&arg) {
            Some(v) => v.clone(),
            None => {
                self.values
                    .insert(arg.clone(), (self.calculation)(arg.clone()));
                self.values.get(&arg).unwrap().clone()
            }
        }
    }
}

After creating a version that finally works, I am really unhappy with it. What really bugs me is that cacher.value(...) has 5(!) calls to clone() in it. Is there a way to avoid this?

492

asked Sep 24 '17 10:09

itmuckel

2 Answers

Your suspicion is correct, the code contains too many calls to clone(), defeating the very optimizations Cacher is designed to achieve.

Cloning the entire cache

The one to start with is the call to self.values.clone() - it creates a copy of the entire cache on every single access.

After non-lexical lifetimes

Remove this clone.

Before non-lexical lifetimes

As you likely discovered yourself, simply removing .clone() doesn't compile. This is because the borrow checker considers the map referenced for the entire duration of match. The shared reference returned by HashMap::get points to the item inside the map, which means that while it exists, it is forbidden to create another mutable reference to the same map, which is required by HashMap::insert. For the code to compile, you need to split up the match in order to force the shared reference to go out of scope before insert is invoked:

// avoids unnecessary clone of the whole map
fn value(&mut self, arg: K) -> V {
    if let Some(v) = self.values.get(&arg).map(V::clone) {
        return v;
    } else {
        let v = (self.calculation)(arg.clone());
        self.values.insert(arg, v.clone());
        v
    }
}

This is much better and probably "good enough" for most practical purposes. The hot path, where the value is already cached, now consists of only a single clone, and that one is actually necessary because the original value must remain in the hash map. (Also, note that cloning doesn't need to be expensive or imply deep copying - the stored value can be an Rc<RealValue>, which buys object sharing for free. In that case, clone() will simply increment the reference count on the object.)

Clone on cache miss

In case of cache miss, the key must be cloned, because calculation is declared to consume it. A single cloning will be sufficient, though, so we can pass the original arg to insert without cloning it again. The key clone still feels unnecessary, though - a calculation function shouldn't require ownership of the key it is transforming. Removing this clone boils down to modifying the signature of the calculation function to take the key by reference. Changing the trait bounds of T to T: Fn(&K) -> V allows the following formulation of value():

// avoids unnecessary clone of the key
fn value(&mut self, arg: K) -> V {
    if let Some(v) = self.values.get(&arg).map(V::clone) {
        return v;
    } else {
        let v = (self.calculation)(&arg);
        self.values.insert(arg, v.clone());
        v
    }
}

Avoiding double lookups

Now are left with exactly two calls to clone(), one in each code path. This is optimal, as far as value cloning is concerned, but the careful reader will still be nagged by one detail: in case of cache miss, the hash table lookup will effectively happen twice for the same key: once in the call to HashMap::get, and then once more in HashMap::insert. It would be nice if we could instead reuse the work done the first time and perform only one hash map lookup. This can be achieved by replacing get() and insert() with entry():

// avoids the second lookup on cache miss
fn value(&mut self, arg: K) -> V {
    match self.values.entry(arg) {
        Entry::Occupied(entry) => entry.into_mut(),
        Entry::Vacant(entry) => {
            let v = (self.calculation)(entry.key());
            entry.insert(v)
        }
    }.clone()
}

We've also taken the opportunity to move the .clone() call after the match.

Runnable example in the playground.

answered Nov 15 '22 11:11

user4815162342

I was solving the same exercise and ended with the following code:

use std::thread;
use std::time::Duration;
use std::collections::HashMap;
use std::hash::Hash;
use std::fmt::Display;

struct Cacher<P, R, T>
where
    T: Fn(&P) -> R,
    P: Eq + Hash + Clone,
{
    calculation: T,
    values: HashMap<P, R>,
}

impl<P, R, T> Cacher<P, R, T>
where
    T: Fn(&P) -> R,
    P: Eq + Hash + Clone,
{
    fn new(calculation: T) -> Cacher<P, R, T> {
        Cacher {
            calculation,
            values: HashMap::new(),
        }
    }

    fn value<'a>(&'a mut self, key: P) -> &'a R {
        let calculation = &self.calculation;
        let key_copy = key.clone();
        self.values
            .entry(key_copy)
            .or_insert_with(|| (calculation)(&key))
    }
}

It only makes a single copy of the key in the value() method. It does not copy the resulting value, but instead returns a reference with a lifetime specifier, which is equal to the lifetime of the enclosing Cacher instance (which is logical, I think, because values in the map will continue to exist until the Cacher itself is dropped).

Here's a test program:

fn main() {
    let mut cacher1 = Cacher::new(|num: &u32| -> u32 {
        println!("calculating slowly...");
        thread::sleep(Duration::from_secs(2));
        *num
    });

    calculate_and_print(10, &mut cacher1);
    calculate_and_print(20, &mut cacher1);
    calculate_and_print(10, &mut cacher1);

    let mut cacher2 = Cacher::new(|str: &&str| -> usize {
        println!("calculating slowly...");
        thread::sleep(Duration::from_secs(2));
        str.len()
    });

    calculate_and_print("abc", &mut cacher2);
    calculate_and_print("defghi", &mut cacher2);
    calculate_and_print("abc", &mut cacher2);
}

fn calculate_and_print<P, R, T>(intensity: P, cacher: &mut Cacher<P, R, T>)
where
    T: Fn(&P) -> R,
    P: Eq + Hash + Clone,
    R: Display,
{
    println!("{}", cacher.value(intensity));
}

And its output:

calculating slowly...
10
calculating slowly...
20
10
calculating slowly...
3
calculating slowly...
6
3

answered Nov 15 '22 10:11

Andrei Tomashpolskiy

Related questions
                            
                                How to instantiate a public tuple struct(with private field) from a different module?
                            
                                Do lifetime annotations in Rust change the lifetime of the variables?
                            
                                How to concatenate static arrays in rust?
                            
                                How to get the index of the current element being processed in the iteration without a for loop?
                            
                                Why does rust's read_line function use a mutable reference instead of a return value?
                            
                                Rust `std::time::Instant` "panicked at 'supplied instant is later than self"
                            
                                How can I pull data out of an Option for independent use?
                            
                                Idiomatic way to choose the first Some() Option?
                            
                                Casting to `*mut` overrules the reference not being `mut`
                            
                                How do I declare an instance of one of my Rust structs as static? [duplicate]
                            
                                Lifetime issue with the Send trait
                            
                                Do rust lifetimes only refer to references?
                            
                                Structure containing fields that know each other
                            
                                Can a while loop evaluate to a value?
                            
                                Why does the compiler tell me to consider using a `let` binding" when I already am?
                            
                                When to use Box<Vec<..>> or Vec<Box<..>>?
                            
                                What's a nice way to create a new vector by consuming two vectors?
                            
                                Match shadowing example in the Patterns section of the Rust book is very perplexing
                            
                                Cannot use Rayon's `.par_iter()`
                            
                                Ownership tracking in Rust: Difference between Box<T> (heap) and T (stack)

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With