What happens when an Arc is cloned?

Tags: clone, concurrency, rust

I am learning concurrency and want to clarify my understanding of the following code example from the Rust book. Please correct me if I am wrong.

    use std::sync::{Arc, Mutex};
    use std::thread;
    use std::time::Duration;

    fn main() {
        let data = Arc::new(Mutex::new(vec![1, 2, 3]));

        for i in 0..3 {
            let data = data.clone();
            thread::spawn(move || {
                let mut data = data.lock().unwrap();
                data[0] += i;
            });
        }

        thread::sleep(Duration::from_millis(50));
    }

What is happening on the line let data = data.clone()?

The Rust book says

we use clone() to create a new owned handle. This handle is then moved into the new thread.

What is the new "owned handle"? It sounds like a reference to the data?

Since clone takes a &self and returns a Self, is each thread modifying the original data instead of a copy? I guess that is why the code is not using data.copy() but data.clone() here.

The data on the right side is a reference, and the data on the left is an owned value. The new binding shadows the old one.

asked Dec 05 '16 23:12

enaJ

1 Answer

[...] what is happening on let data = data.clone()?

Arc stands for Atomically Reference Counted. An Arc manages one object (of type T) and serves as a proxy to allow for shared ownership, meaning: one object is owned by multiple names. Wow, that sounds abstract, let's break it down!

Shared Ownership

Let's say you have an object of type Turtle 🐢 which you bought for your family. Now the problem arises that you can't assign a clear owner of the turtle: every family member kind of owns that pet! This means (and sorry for being morbid here) that if one member of the family dies, the turtle won't die with that family member. The turtle will only die if all members of the family are gone as well. Everyone owns it, and the last one cleans up.

So how would you express that kind of shared ownership in Rust? You will quickly notice that it's impossible to do with ordinary ownership and references alone: you'd always have to choose one owner, and everyone else would only have a reference to the turtle. Not good!

So along come Rc and Arc (which, for the sake of this story, serve the exact same purpose). These allow for shared ownership by using a bit of unsafe Rust internally. Let's look at the memory after executing the following code (note: the memory layout is simplified for learning and might not match the exact layout in the real world):

    let annas = Rc::new(Turtle { legs: 4 });

Memory:

      Stack                    Heap
      -----                    ----

      annas:
    +--------+               +------------+
    | ptr: o-|-------------->| count: 1   |
    +--------+               | data: 🐢   |
                             +------------+

We see that the turtle lives on the heap... next to a counter which is set to 1. This counter knows how many owners the object data currently has. And 1 is correct: annas is the only one owning the turtle right now. Let's clone() the Rc to get more owners:

    let peters = annas.clone();
    let bobs = annas.clone();

Now the memory looks like this:

      Stack                    Heap
      -----                    ----

      annas:
    +--------+               +------------+
    | ptr: o-|-------------->| count: 3   |
    +--------+    ^          | data: 🐢   |
                  |          +------------+
     peters:      |
    +--------+    |
    | ptr: o-|----+
    +--------+    ^
                  |
      bobs:       |
    +--------+    |
    | ptr: o-|----+
    +--------+

As you can see, the turtle still exists only once. But the reference count was increased and is now 3, which makes sense, because the turtle has three owners now. All three owners reference the same memory block on the heap. That's what the Rust book calls an owned handle: each owner of such a handle also kind of owns the underlying object.
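The counting described above can be observed directly. The following is a minimal sketch, assuming a hypothetical Turtle struct and a helper function owner_counts that are not part of the original answer; it reads the counter with Rc::strong_count and checks with Rc::ptr_eq that all handles point at the same allocation.

```rust
use std::rc::Rc;

// Hypothetical type for illustration, mirroring the turtle in the story.
struct Turtle {
    legs: u32,
}

// Returns the owner count right after creation, after two clones,
// and after one of the owners has been dropped.
fn owner_counts() -> (usize, usize, usize) {
    let annas = Rc::new(Turtle { legs: 4 });
    let after_new = Rc::strong_count(&annas);

    // Cloning the handle only bumps the counter; the Turtle is not copied.
    let peters = Rc::clone(&annas);
    let bobs = Rc::clone(&annas);
    let after_clones = Rc::strong_count(&annas);

    // All three handles point at the same heap allocation.
    assert!(Rc::ptr_eq(&annas, &peters));
    assert!(Rc::ptr_eq(&annas, &bobs));
    assert_eq!(bobs.legs, 4);

    // One owner goes away; the turtle survives because two owners remain.
    drop(peters);
    let after_drop = Rc::strong_count(&annas);

    (after_new, after_clones, after_drop)
}

fn main() {
    let (after_new, after_clones, after_drop) = owner_counts();
    println!("{} {} {}", after_new, after_clones, after_drop);
}
```

Rc::clone(&annas) and annas.clone() do the same thing; the first form just makes it more obvious that only the handle is cloned.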

(also see "Why is std::rc::Rc<> not Copy?")

Atomicity and Mutability

What's the difference between Arc<T> and Rc<T>, you ask? The Arc increments and decrements its counter in an atomic fashion. That means that multiple threads can increment and decrement the counter simultaneously without a problem. That's why you can send Arcs across thread boundaries (provided T itself is Send and Sync), but not Rcs.
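As a rough illustration of sending handles across thread boundaries, here is a small sketch. The function name count_across_threads and the vector contents are made up for this example. Each thread receives its own Arc handle to the same vector and only reads from it.

```rust
use std::sync::Arc;
use std::thread;

// Returns the total computed by the threads and the owner count
// that remains once every thread has finished.
fn count_across_threads() -> (i32, usize) {
    let shared = Arc::new(vec![1, 2, 3]);
    let mut handles = Vec::new();

    for _ in 0..3 {
        // Each thread gets its own handle; the vector itself is not copied.
        // Swapping Arc for Rc here is rejected by the compiler,
        // because Rc<T> does not implement Send.
        let shared = Arc::clone(&shared);
        handles.push(thread::spawn(move || shared.iter().sum::<i32>()));
    }

    let mut total = 0;
    for handle in handles {
        total += handle.join().unwrap();
    }

    // The threads have finished and dropped their handles,
    // so only the original owner is left.
    (total, Arc::strong_count(&shared))
}

fn main() {
    let (total, owners) = count_across_threads();
    println!("total = {}, owners left = {}", total, owners);
}
```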

Now you may notice that you can't mutate the data through an Arc<T>! What if your 🐢 loses a leg? Arc is not designed to allow mutable access from multiple owners at (possibly) the same time. That's why you often see types like Arc<Mutex<T>>. The Mutex<T> is a type that offers interior mutability, which means that you can get a &mut T from a &Mutex<T>: calling lock() returns a guard that dereferences to the T inside. This would normally conflict with Rust's core principles, but it's perfectly safe because the mutex also manages access: you have to request access to the object. If another thread currently has access to the object, you have to wait. Therefore, at any given moment, only one thread is able to access T.
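To make the turtle-loses-a-leg case concrete, here is a minimal sketch. The Turtle struct and the function legs_after_accident are hypothetical names chosen for this example. A second thread locks the mutex, mutates the turtle through the guard, and the original owner then reads the result.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical type for illustration.
struct Turtle {
    legs: u32,
}

fn legs_after_accident() -> u32 {
    let turtle = Arc::new(Mutex::new(Turtle { legs: 4 }));

    let handle = {
        let turtle = Arc::clone(&turtle);
        thread::spawn(move || {
            // lock() blocks until no other thread holds the guard.
            let mut guard = turtle.lock().unwrap();
            // The guard dereferences to the Turtle and allows mutation.
            guard.legs -= 1;
            // The guard is dropped at the end of this scope, releasing the lock.
        })
    };

    handle.join().unwrap();

    // Bind to a local first so the temporary guard is released
    // before `turtle` itself goes out of scope.
    let legs = turtle.lock().unwrap().legs;
    legs
}

fn main() {
    println!("legs left: {}", legs_after_accident());
}
```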

Conclusion

[...] is each thread modifying the original data instead of a copy?

As you can hopefully understand from the explanation above: yes, each thread is modifying the original data. A clone() on an Arc<T> won't clone the T, but merely creates another owned handle, which in turn is just a pointer that behaves as if it owns the underlying object.
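One way to check this conclusion is to rerun the program from the question and inspect the vector afterwards. The sketch below is a variation on that program, not the Rust book's version: it joins the threads instead of sleeping, and the helper name run is an assumption made for this example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Runs the three threads from the question and returns the final vector.
fn run() -> Vec<i32> {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));

    // Spawn all threads first, each with its own handle to the same vector.
    let handles: Vec<_> = (0..3)
        .map(|i| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                let mut data = data.lock().unwrap();
                data[0] += i;
            })
        })
        .collect();

    // Joining replaces the sleep: it waits exactly until each thread is done.
    for handle in handles {
        handle.join().unwrap();
    }

    // Copy the vector out while holding the lock, then release the guard.
    let result = data.lock().unwrap().clone();
    result
}

fn main() {
    // If each thread had modified its own copy, this would still be [1, 2, 3].
    println!("{:?}", run());
}
```

All three additions (0, 1 and 2) land in the same first element, so the result is [4, 2, 3] regardless of the order in which the threads run.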

answered Sep 19 '22 01:09

Lukas Kalbertodt
