
What is the Rust equivalent of C++'s shared_ptr?

Why is this syntax not allowed in Rust:

fn main() {
    let a = String::from("ping");
    let b = a;

    println!("{{{}, {}}}", a, b);
}

When I tried to compile this code, I got:

error[E0382]: use of moved value: `a`
 --> src/main.rs:5:28
  |
3 |     let b = a;
  |         - value moved here
4 | 
5 |     println!("{{{}, {}}}", a, b);
  |                            ^ value used here after move
  |
  = note: move occurs because `a` has type `std::string::String`, which does not implement the `Copy` trait

In fact, we can simply make a reference, which does not do the same thing at runtime:

fn main() {
    let a = String::from("ping");
    let b = &a;

    println!("{{{}, {}}}", a, b);
}

And it works:

{ping, ping}

According to the Rust Book, this is to avoid double-free bugs, because Rust's variables are copied by reference instead of by value. Rust will simply invalidate the first object and make it unusable...

[image]

We have to do something like this:

[image]

I like the idea of copying by reference, but why automatically invalidate the first one?

It should be possible to avoid the double free with a different method. For example, C++ already has a great tool to allow multiple free calls... The shared_ptr calls free only when no other pointer points to the object - it seems to be very similar to what we are actually doing, with the difference that shared_ptr has a counter.

For example, we can count the number of references to each object during the compilation time and call free only when the last reference goes out of the scope.

But Rust is a young language; maybe they haven't had the time to implement something similar? Has Rust planned to allow a second reference to an object without invalidating the first one, or should we take the habit to only work with a reference of a reference?

tirz asked Apr 14 '18 17:04


2 Answers

Either Rc or Arc is the replacement for shared_ptr. Which you choose depends on what level of thread-safety you need for the shared data; Rc is for non-threaded cases and Arc is when you need threads:

use std::rc::Rc;

fn main() {
    let a = Rc::new(String::from("ping"));
    let b = a.clone();

    println!("{{{}, {}}}", a, b);
}

Like shared_ptr, this does not copy the String itself. It only increases a reference counter at runtime when clone is called and decreases the counter when each copy goes out of scope.
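
To make the counter visible, here is a minimal sketch using Rc::strong_count from the standard library. The helper name observe_counts is made up for this illustration; it records the count after creating the Rc, after cloning it, and after dropping the clone.

```rust
use std::rc::Rc;

// Hypothetical helper: returns the strong count observed after creation,
// after cloning, and after the clone has been dropped.
fn observe_counts() -> (usize, usize, usize) {
    let a = Rc::new(String::from("ping"));
    let after_new = Rc::strong_count(&a);

    // Rc::clone(&a) is the same operation as a.clone(): it bumps the
    // counter and does not copy the String.
    let b = Rc::clone(&a);
    let after_clone = Rc::strong_count(&a);

    // Dropping b decrements the counter again.
    drop(b);
    let after_drop = Rc::strong_count(&a);

    (after_new, after_clone, after_drop)
}

fn main() {
    let (n1, n2, n3) = observe_counts();
    println!("{} {} {}", n1, n2, n3);
}
```

The String is freed only when the last Rc (here a) goes out of scope.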

Rc and Arc have better thread semantics than shared_ptr, which is only semi-thread-safe: shared_ptr's reference counter itself is thread-safe, but the shared data is not "magically" made thread-safe.

If you use shared_ptr in a threaded program, you still have more work to do to ensure it's safe. In a non-threaded program, you are paying for some thread-safety you don't need.
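
For comparison, a rough sketch of the threaded case with Arc. The function name total_len_seen and the worker count are assumptions for illustration only. Each thread gets its own Arc handle to the same String and only reads from it.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: shares one String between `workers` threads and
// returns the sum of the lengths each thread observed.
fn total_len_seen(workers: usize) -> usize {
    let shared = Arc::new(String::from("ping"));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread owns its own handle; the String is not copied.
            let local = Arc::clone(&shared);
            thread::spawn(move || local.len())
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("{}", total_len_seen(3));
}
```

Replacing Arc with Rc here should be rejected by the compiler, because Rc does not implement Send.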

If you wish to allow mutating the shared value, you will need to switch to runtime borrow checking as well. This is provided by types like Cell, RefCell, Mutex etc. RefCell is appropriate for String and Rc:

use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    let a = Rc::new(RefCell::new(String::from("ping")));
    let b = a.clone();

    println!("{{{}, {}}}", a.borrow(), b.borrow());

    a.borrow_mut().push_str("pong");
    println!("{{{}, {}}}", a.borrow(), b.borrow());
}
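
"Runtime borrow checking" means the borrow rules are still enforced, just while the program runs. A small sketch, assuming the made-up helper name runtime_borrow_check: it uses RefCell::try_borrow_mut, which returns an error instead of panicking when a conflicting borrow is active.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper: returns whether the conflicting mutable borrow was
// rejected, plus the final contents of the shared String.
fn runtime_borrow_check() -> (bool, String) {
    let a = Rc::new(RefCell::new(String::from("ping")));
    let b = Rc::clone(&a);

    // While a shared borrow is alive, an exclusive borrow is refused.
    let reader = a.borrow();
    let rejected = b.try_borrow_mut().is_err();
    drop(reader);

    // The shared borrow has ended, so mutation succeeds now.
    b.borrow_mut().push_str("pong");

    let contents = a.borrow().clone();
    (rejected, contents)
}

fn main() {
    let (rejected, contents) = runtime_borrow_check();
    println!("{} {}", rejected, contents);
}
```

Calling borrow_mut() instead of try_borrow_mut() while reader is alive would panic at runtime.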

"we can count the number of references to each object during the compilation time and call free only when the last reference goes out of the scope."

That's almost exactly what Rust does with references. It doesn't actually use a counter, but it only lets you use references to a value while that value is guaranteed to remain at the same memory address.

C++'s shared_ptr does not do this at compile-time. shared_ptr, Rc, and Arc are all runtime constructs that maintain a counter.

"Is it possible to make a reference to the object without invalidating the first reference?"

That's exactly what Rust does with references, and what you've already done:

fn main() {
    let a = String::from("ping");
    let b = &a;

    println!("{{{}, {}}}", a, b);
}

Even better, the compiler will stop you from using b as soon as a is no longer valid.
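
A small sketch of that check, with a hypothetical function name borrow_within_scope. It compiles as written because b is only used while a is alive; the commented-out line shows the kind of use the compiler should reject.

```rust
// Hypothetical helper: borrows a String inside an inner scope and returns
// the length read through the reference.
fn borrow_within_scope() -> usize {
    let b: &String;
    let len;
    {
        let a = String::from("ping");
        b = &a; // b borrows a
        len = b.len(); // fine: a is still alive here
    } // a is dropped here, so b must not be used past this point

    // Uncommenting the next line should be rejected with an error along
    // the lines of E0597 ("`a` does not live long enough"):
    // println!("{}", b);

    len
}

fn main() {
    println!("{}", borrow_within_scope());
}
```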

"because Rust's variables are copied by reference instead of by value"

This is not true. When you assign a value, ownership of the value is transferred to the new variable. Semantically, the memory address of the variable has changed and thus reading that address could lead to memory unsafety.
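
To illustrate the difference between a move and an explicit copy, here is a sketch with a made-up helper move_and_clone. It compares the heap buffer addresses of the strings: the moved-to variable takes over the original buffer, while clone() allocates a new one.

```rust
// Hypothetical helper: returns (moved value reuses the original buffer,
// cloned value has its own buffer).
fn move_and_clone() -> (bool, bool) {
    let a = String::from("ping");
    let a_ptr = a.as_ptr(); // raw pointer to a's heap buffer; not a borrow

    let copy = a.clone(); // explicit deep copy: a new heap buffer
    let b = a; // move: b takes over a's buffer, a can no longer be used

    (b.as_ptr() == a_ptr, copy.as_ptr() != a_ptr)
}

fn main() {
    let (same_buffer, separate_buffer) = move_and_clone();
    println!("{} {}", same_buffer, separate_buffer);
}
```

If you want both variables to stay usable and own their data, a.clone() is the explicit way to ask for that.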

"should we take the habit to only work with a reference"

Yes, using references, when possible, is the most idiomatic choice. These require zero runtime overhead and the compiler will tell you about errors, as opposed to encountering them at runtime.

There are certainly times when Rc or Arc is useful. Often they are needed for cyclic data structures. You shouldn't feel bad about using them if you cannot get plain references to work.
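
As one possible shape for such a structure, here is a sketch of a parent/child link. The Node type and the helper parent_name_of_child are invented for this example, and it brings in std::rc::Weak, which is not mentioned above: the back-pointer from child to parent is weak so that the two nodes do not keep each other alive forever.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type: strong links down to children, weak link up.
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

// Hypothetical helper: builds a parent with one child and looks up the
// parent's name starting from the child.
fn parent_name_of_child() -> Option<String> {
    let parent = Rc::new(Node {
        name: String::from("parent"),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        name: String::from("child"),
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));

    // upgrade() returns Some only while the parent is still alive.
    let name = child.parent.borrow().upgrade().map(|p| p.name.clone());
    name
}

fn main() {
    println!("{:?}", parent_name_of_child());
}
```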

"with a reference of a reference?"

This is a bit of a downside, as the extra indirection is unfortunate. If you really need to, you can reduce it. If you don't need to modify the string, you can switch to an Rc<str> instead:

use std::rc::Rc;

fn main() {
    let a: Rc<str> = Rc::from("ping");
    let b = a.clone();

    println!("{{{}, {}}}", a, b);
}

If you need to keep the ability to modify the String sometimes, you can also explicitly convert a &Rc<T> to a &T:

use std::rc::Rc;

fn main() {
    let a = Rc::new(String::from("ping"));
    let b = a.clone();

    let a_s: &str = &*a;
    let b_s: &str = &*b;

    println!("{{{}, {}}}", a_s, b_s);
}

See also:

  • What is the right smart pointer to have multiple strong references and allow mutability?
  • When I can use either Cell or RefCell, which should I choose?
  • Situations where Cell or RefCell is the best choice
  • Why is it discouraged to accept a reference to a String (&String), Vec (&Vec) or Box (&Box) as a function argument?
  • How to build an Rc<str> or Rc<[T]>?
  • CppCon 2017: Louis Brandy “Curiously Recurring C++ Bugs at Facebook”
Shepmaster answered Sep 28 '22 10:09


"Maybe we can simply count the number of references to each object during the compile time and call free only when the last reference goes out of the scope."

You're on the right track! This is what Rc is for. It's a smart pointer type very much like std::shared_ptr in C++. It frees memory only after the last pointer instance has gone out of scope:

use std::rc::Rc;

fn main() {
    let a = Rc::new(String::from("ping"));

    // clone() here does not copy the string; it creates another pointer
    // and increments the reference count
    let b = a.clone();

    println!("{{{}, {}}}", *a, *b);
}

Since you only get immutable access to Rc's contents (it's shared after all, and shared mutability is forbidden in Rust) you need interior mutability to be able to change its contents, implemented via Cell or RefCell:

use std::rc::Rc;
use std::cell::RefCell;

fn main() {
    let a = Rc::new(RefCell::new(String::from("Hello")));
    let b = a.clone();

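    // Dereference the RefMut guard to reach the String before appending
    *a.borrow_mut() += ", World!";

    // RefCell itself is not Display; borrow() gives access to the String
    println!("{}", b.borrow()); // Prints "Hello, World!"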
}

But most of the time, you shouldn't need to use Rc (or its thread-safe sibling Arc) at all. Rust's ownership model mostly allows you to avoid the overhead of reference counting by declaring the String instance in one place and using references to it everywhere else, just like you did in your second snippet. Try focusing on that and use Rc only if really necessary, such as when you implement a graph-like structure.
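
A minimal sketch of that style, with function names (shout, count, use_by_reference) invented for illustration: the String has a single owner, and everything else borrows it as &str.

```rust
// Hypothetical helpers that only need to read the string.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn count(s: &str) -> usize {
    s.len()
}

// One owner, several borrows: no reference counting involved.
fn use_by_reference() -> (String, usize, String) {
    let owner = String::from("ping");

    let loud = shout(&owner); // &String coerces to &str
    let n = count(&owner); // owner is still usable after each borrow

    (loud, n, owner)
}

fn main() {
    let (loud, n, owner) = use_by_reference();
    println!("{} {} {}", loud, n, owner);
}
```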

Fabian Knorr answered Sep 28 '22 08:09