I'm learning/experimenting with Rust, and in all the elegance that I find in this language, there is one peculiarity that baffles me and seems totally out of place.
Rust automatically dereferences pointers when making method calls. I made some tests to determine the exact behaviour:

    struct X { val: i32 }
    impl std::ops::Deref for X {
        type Target = i32;
        fn deref(&self) -> &i32 { &self.val }
    }

    trait M { fn m(self); }
    impl M for i32   { fn m(self) { println!("i32::m()");  } }
    impl M for X     { fn m(self) { println!("X::m()");    } }
    impl M for &X    { fn m(self) { println!("&X::m()");   } }
    impl M for &&X   { fn m(self) { println!("&&X::m()");  } }
    impl M for &&&X  { fn m(self) { println!("&&&X::m()"); } }

    trait RefM { fn refm(&self); }
    impl RefM for i32  { fn refm(&self) { println!("i32::refm()");  } }
    impl RefM for X    { fn refm(&self) { println!("X::refm()");    } }
    impl RefM for &X   { fn refm(&self) { println!("&X::refm()");   } }
    impl RefM for &&X  { fn refm(&self) { println!("&&X::refm()");  } }
    impl RefM for &&&X { fn refm(&self) { println!("&&&X::refm()"); } }

    struct Y { val: i32 }
    impl std::ops::Deref for Y {
        type Target = i32;
        fn deref(&self) -> &i32 { &self.val }
    }

    struct Z { val: Y }
    impl std::ops::Deref for Z {
        type Target = Y;
        fn deref(&self) -> &Y { &self.val }
    }

    #[derive(Clone, Copy)]
    struct A;

    impl M for    A { fn m(self) { println!("A::m()");    } }
    impl M for &&&A { fn m(self) { println!("&&&A::m()"); } }

    impl RefM for    A { fn refm(&self) { println!("A::refm()");    } }
    impl RefM for &&&A { fn refm(&self) { println!("&&&A::refm()"); } }

    fn main() {
        // I'll use @ to denote left side of the dot operator
        (*X{val:42}).m();        // i32::m()    , Self == @
        X{val:42}.m();           // X::m()      , Self == @
        (&X{val:42}).m();        // &X::m()     , Self == @
        (&&X{val:42}).m();       // &&X::m()    , Self == @
        (&&&X{val:42}).m();      // &&&X::m()   , Self == @
        (&&&&X{val:42}).m();     // &&&X::m()   , Self == *@
        (&&&&&X{val:42}).m();    // &&&X::m()   , Self == **@
        println!("-------------------------");

        (*X{val:42}).refm();     // i32::refm() , Self == @
        X{val:42}.refm();        // X::refm()   , Self == @
        (&X{val:42}).refm();     // X::refm()   , Self == *@
        (&&X{val:42}).refm();    // &X::refm()  , Self == *@
        (&&&X{val:42}).refm();   // &&X::refm() , Self == *@
        (&&&&X{val:42}).refm();  // &&&X::refm(), Self == *@
        (&&&&&X{val:42}).refm(); // &&&X::refm(), Self == **@
        println!("-------------------------");

        Y{val:42}.refm();        // i32::refm() , Self == *@
        Z{val:Y{val:42}}.refm(); // i32::refm() , Self == **@
        println!("-------------------------");

        A.m();                   // A::m()      , Self == @
        // without the Copy trait, (&A).m() would be a compilation error:
        // cannot move out of borrowed content
        (&A).m();                // A::m()      , Self == *@
        (&&A).m();               // &&&A::m()   , Self == &@
        (&&&A).m();              // &&&A::m()   , Self == @
        A.refm();                // A::refm()   , Self == @
        (&A).refm();             // A::refm()   , Self == *@
        (&&A).refm();            // A::refm()   , Self == **@
        (&&&A).refm();           // &&&A::refm(), Self == @
    }

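So, it seems that, more or less:

- The compiler will insert as many dereference operators as necessary to invoke a method.
- The compiler, when resolving methods declared using &self (call-by-reference):
  - first tries calling for a single dereference of self,
  - then tries calling for the exact type of self,
  - then tries inserting as many dereference operators as necessary for a match.
- Methods declared using self (call-by-value) for type T behave as if they were declared using &self (call-by-reference) for type &T and called on the reference to whatever is on the left side of the dot operator.
- The above rules are first tried with raw built-in dereferencing, and if there's no match, the overload with the Deref trait is used.

What are the exact auto-dereferencing rules? Can anyone give any formal rationale for such a design decision?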
Rust will also insert automatic dereferencing as part of deref coercion. This is a special coercion that can only convert from one reference type to another. For example, if x is a String, this is what lets you get a &str by writing &x instead of &*x.
The dereference operator is also known as the indirection operator. Simply put, the dereferencing operator allows us to get the value stored in the memory address of a pointer. In Rust, we use the Deref trait to customize the behaviour of the dereferencing operator.
In general, &* means to first dereference (*) and then reference (&) a value.
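To make the deref-coercion remarks above concrete, here is a minimal sketch. `Wrapper` and `len_of` are hypothetical names invented for this illustration; only `String`, `str` and `std::ops::Deref` come from the standard library.

```rust
use std::ops::Deref;

// Hypothetical wrapper type, used only for this illustration.
struct Wrapper {
    inner: String,
}

impl Deref for Wrapper {
    type Target = String;
    fn deref(&self) -> &String {
        &self.inner
    }
}

// Takes a &str, so callers holding other reference types rely on coercion.
fn len_of(s: &str) -> usize {
    s.len()
}

fn main() {
    let x = String::from("hello");
    // Explicit form: dereference String to str, then borrow again.
    assert_eq!(len_of(&*x), 5);
    // Deref coercion: &String is converted to &str for us.
    assert_eq!(len_of(&x), 5);

    let w = Wrapper { inner: String::from("hi") };
    // Coercion can chain through several Deref impls: &Wrapper -> &String -> &str.
    assert_eq!(len_of(&w), 2);
    // The explicit spelling needs two derefs before the borrow.
    assert_eq!(len_of(&**w), 2);
}
```

Note that this coercion applies to function arguments and similar sites; the method-call rules discussed in the answer are a separate mechanism.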
Your pseudo-code is pretty much correct. For this example, suppose we had a method call foo.bar() where foo: T. I'm going to use the fully qualified syntax (FQS) to be unambiguous about what type the method is being called with, e.g. A::bar(foo) or A::bar(&***foo). I'm just going to write a pile of random capital letters; each one is just some arbitrary type/trait, except T, which is always the type of the original variable foo that the method is called on.
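As a small illustration of the FQS notation used in this answer, the sketch below calls the same method three ways. The trait `A` and method `bar` follow the placeholder names in the text; `Thing` and the three helper functions are hypothetical.

```rust
// Hypothetical trait and type; `A` and `bar` mirror the placeholders in the text.
trait A {
    fn bar(&self) -> &'static str;
}

struct Thing;

impl A for Thing {
    fn bar(&self) -> &'static str {
        "A::bar with Self = Thing"
    }
}

// Method-call syntax: the compiler picks the derefs and refs.
fn call_with_dot(foo: &&&Thing) -> &'static str {
    foo.bar()
}

// Fully qualified syntax: the receiver is spelled out by hand.
fn call_with_fqs(foo: &&&Thing) -> &'static str {
    A::bar(&***foo)
}

// The most explicit form also names the Self type.
fn call_with_full_fqs(foo: &&&Thing) -> &'static str {
    <Thing as A>::bar(&***foo)
}

fn main() {
    let foo = &&&Thing; // in the text's notation, T = &&&Thing
    assert_eq!(call_with_dot(foo), call_with_fqs(foo));
    assert_eq!(call_with_fqs(foo), call_with_full_fqs(foo));
    println!("{}", call_with_dot(foo));
}
```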
The core of the algorithm is:

- For each "dereference step" U (that is, set U = T and then U = *T, ...):
  1. if there's a method bar where the receiver type (the type of self in the method) matches U exactly, use it (a "by value method");
  2. otherwise, add one auto-ref (take & or &mut of the receiver), and, if some method's receiver matches &U, use it (an "autorefd method").

Notably, everything considers the "receiver type" of the method, not the Self type of the trait, i.e. impl ... for Foo { fn method(&self) {} } thinks about &Foo when matching the method, and fn method2(&mut self) would think about &mut Foo when matching.
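The loop described above can be sketched as a toy model. This is not the compiler's implementation: it only handles shared references, ignores &mut, inherent-versus-trait precedence and ambiguity, and every name in it (`Ty`, `refs`, `deref`, `probe`, `refm_receivers`) is hypothetical. It works purely on receiver types, as the text stresses.

```rust
// Toy model of a type: a named type, or a shared reference to another type.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Named(&'static str),
    Ref(Box<Ty>),
}

// Wrap `inner` in `depth` layers of `&`.
fn refs(depth: usize, inner: Ty) -> Ty {
    (0..depth).fold(inner, |t, _| Ty::Ref(Box::new(t)))
}

// One dereference step: built-in for references, table lookup for Deref impls.
fn deref(ty: &Ty, deref_impls: &[(&'static str, Ty)]) -> Option<Ty> {
    match ty {
        Ty::Ref(inner) => Some((**inner).clone()),
        Ty::Named(name) => deref_impls
            .iter()
            .find(|(from, _)| from == name)
            .map(|(_, target)| target.clone()),
    }
}

// Returns the adjusted receiver expression (e.g. "&**foo"), or None if no
// receiver type matches before the deref chain runs out.
fn probe(start: &Ty, receivers: &[Ty], deref_impls: &[(&'static str, Ty)]) -> Option<String> {
    let mut u = start.clone();
    let mut derefs = 0;
    loop {
        let stars = "*".repeat(derefs);
        // Step 1: a method whose receiver type is exactly U ("by value").
        if receivers.contains(&u) {
            return Some(format!("{}foo", stars));
        }
        // Step 2: add one auto-ref and look for a receiver of type &U.
        if receivers.contains(&refs(1, u.clone())) {
            return Some(format!("&{}foo", stars));
        }
        // Otherwise take the next dereference step, if there is one.
        u = deref(&u, deref_impls)?;
        derefs += 1;
    }
}

// Receiver types of refm in the question: RefM takes &self and is implemented
// for i32, X, &X, &&X and &&&X, so the receivers are &i32, &X, ..., &&&&X.
fn refm_receivers() -> Vec<Ty> {
    let mut v = vec![refs(1, Ty::Named("i32"))];
    v.extend((1..=4).map(|d| refs(d, Ty::Named("X"))));
    v
}

fn main() {
    // Deref impls from the question: X -> i32, Y -> i32, Z -> Y.
    let derefs = [
        ("X", Ty::Named("i32")),
        ("Y", Ty::Named("i32")),
        ("Z", Ty::Named("Y")),
    ];
    let receivers = refm_receivers();
    let starts = [
        Ty::Named("X"),
        refs(1, Ty::Named("X")),
        refs(5, Ty::Named("X")),
        Ty::Named("Z"),
    ];
    for start in starts {
        println!("{:?} -> {:?}", start, probe(&start, &receivers, &derefs));
    }
}
```

The model returns the receiver expression in the same notation as the FQS calls in this answer, so its results can be compared with the worked cases by hand.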
It is an error if there are ever multiple trait methods valid in the inner steps (that is, there can only be zero or one trait methods valid in each of 1. or 2., but there can be one valid for each: the one from 1. will be taken first), and inherent methods take precedence over trait ones. It's also an error if we get to the end of the loop without finding anything that matches. It is also an error to have recursive Deref implementations, which make the loop infinite (they'll hit the "recursion limit").
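A minimal sketch of the precedence and ambiguity rules just described. All names here (`Foo`, `Named`, `Hello`, `Greet`) are hypothetical; the ambiguous call is left commented out because it is rejected at compile time.

```rust
struct Foo;

impl Foo {
    // Inherent method.
    fn name(&self) -> &'static str {
        "inherent"
    }
}

// A trait method with the same name and receiver type as the inherent one.
trait Named {
    fn name(&self) -> &'static str;
}

impl Named for Foo {
    fn name(&self) -> &'static str {
        "Named::name"
    }
}

// Two traits that both provide `hello` for Foo.
trait Hello {
    fn hello(&self) -> &'static str;
}

trait Greet {
    fn hello(&self) -> &'static str;
}

impl Hello for Foo {
    fn hello(&self) -> &'static str {
        "Hello::hello"
    }
}

impl Greet for Foo {
    fn hello(&self) -> &'static str {
        "Greet::hello"
    }
}

fn main() {
    let foo = Foo;

    // The inherent method takes precedence over the trait method.
    assert_eq!(foo.name(), "inherent");
    // FQS still reaches the trait method.
    assert_eq!(Named::name(&foo), "Named::name");

    // foo.hello(); // rejected: two trait methods are valid at the same step
    // FQS resolves the ambiguity by naming the trait.
    assert_eq!(Hello::hello(&foo), "Hello::hello");
    assert_eq!(Greet::hello(&foo), "Greet::hello");
}
```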
These rules seem to do-what-I-mean in most circumstances, although having the ability to write the unambiguous FQS form is very useful in some edge cases, and for sensible error messages for macro-generated code.
Only one auto-reference is added because &foo retains a strong connection to foo (it is the address of foo itself), but taking more starts to lose it: &&foo is the address of some temporary variable on the stack that stores &foo.

Suppose we have a call foo.refm(). If foo has type:

- X, then we start with U = X. refm has receiver type &..., so step 1 doesn't match; taking an auto-ref gives us &X, and this does match (with Self = X), so the call is RefM::refm(&foo).
- &X, we start with U = &X, which matches &self in the first step (with Self = X), and so the call is RefM::refm(foo).
- &&&&&X, this doesn't match either step (the trait isn't implemented for &&&&X or &&&&&X), so we dereference once to get U = &&&&X, which matches 1 (with Self = &&&X), and the call is RefM::refm(*foo).
- Z, this doesn't match either step, so it is dereferenced once to get Y, which also doesn't match, so it's dereferenced again to get i32 (the Deref target of Y in the question's code), which doesn't match 1 but does match after autorefing, so the call is RefM::refm(&**foo), with Self = i32.
- &&A, step 1 doesn't match and neither does step 2, since the trait is not implemented for &A (for 1) or &&A (for 2), so it is dereferenced to &A, which matches 1, with Self = A.

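The foo.refm() cases above can be checked by comparing each dot call with its FQS spelling. This sketch reuses the question's types, with one assumption made for testability: refm returns a string instead of printing it.

```rust
use std::ops::Deref;

struct X { val: i32 }
impl Deref for X {
    type Target = i32;
    fn deref(&self) -> &i32 { &self.val }
}

struct Y { val: i32 }
impl Deref for Y {
    type Target = i32;
    fn deref(&self) -> &i32 { &self.val }
}

struct Z { val: Y }
impl Deref for Z {
    type Target = Y;
    fn deref(&self) -> &Y { &self.val }
}

struct A;

// Same impls as the question, but returning the label instead of printing it.
trait RefM { fn refm(&self) -> &'static str; }
impl RefM for i32  { fn refm(&self) -> &'static str { "i32::refm()" } }
impl RefM for X    { fn refm(&self) -> &'static str { "X::refm()" } }
impl RefM for &X   { fn refm(&self) -> &'static str { "&X::refm()" } }
impl RefM for &&X  { fn refm(&self) -> &'static str { "&&X::refm()" } }
impl RefM for &&&X { fn refm(&self) -> &'static str { "&&&X::refm()" } }
impl RefM for A    { fn refm(&self) -> &'static str { "A::refm()" } }
impl RefM for &&&A { fn refm(&self) -> &'static str { "&&&A::refm()" } }

fn main() {
    let x = X { val: 42 };

    // foo: X, one auto-ref, Self = X
    assert_eq!(x.refm(), "X::refm()");
    assert_eq!(x.refm(), RefM::refm(&x));

    // foo: &X, matches step 1 directly, Self = X
    let foo = &x;
    assert_eq!(foo.refm(), "X::refm()");
    assert_eq!(foo.refm(), RefM::refm(foo));

    // foo: &&&&&X, one deref and then step 1, Self = &&&X
    let foo = &&&&&x;
    assert_eq!(foo.refm(), "&&&X::refm()");
    assert_eq!(foo.refm(), RefM::refm(*foo));

    // foo: Z, two derefs (Z -> Y -> i32) and then an auto-ref, Self = i32
    let foo = Z { val: Y { val: 42 } };
    assert_eq!(foo.refm(), "i32::refm()");
    assert_eq!(foo.refm(), RefM::refm(&**foo));

    // foo: &&A, one deref gives &A, which matches step 1, Self = A
    let a = A;
    let foo = &&a;
    assert_eq!(foo.refm(), "A::refm()");
    assert_eq!(foo.refm(), RefM::refm(*foo));
}
```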
Suppose we have foo.m(), and that A isn't Copy. If foo has type:

- A, then U = A matches self directly, so the call is M::m(foo) with Self = A.
- &A, then 1. doesn't match, and neither does 2. (neither &A nor &&A implement the trait), so it is dereferenced to A, which does match, but M::m(*foo) requires taking A by value and hence moving out of foo, hence the error.
- &&A, 1. doesn't match, but autorefing gives &&&A, which does match, so the call is M::m(&foo) with Self = &&&A.

(This answer is based on the code, and is reasonably close to the (slightly outdated) README. Niko Matsakis, the main author of this part of the compiler/language, also glanced over this answer.)
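The foo.m() cases above can be checked the same way. In this sketch A is deliberately not Copy, and m is assumed to return a string rather than print, so that the results can be asserted; the failing &A case is left as a comment.

```rust
// A is not Copy here, matching the assumption in the walkthrough.
struct A;

trait M { fn m(self) -> &'static str; }
impl M for A    { fn m(self) -> &'static str { "A::m()" } }
impl M for &&&A { fn m(self) -> &'static str { "&&&A::m()" } }

fn main() {
    // foo: A, matches step 1 directly, Self = A
    assert_eq!(A.m(), "A::m()");
    assert_eq!(A.m(), M::m(A));

    let a = A;

    // foo: &A resolves to M::m(*foo) with Self = A, which would move A out of
    // a shared reference. The question reports the error as
    // "cannot move out of borrowed content".
    // let foo = &a;
    // foo.m();

    // foo: &&A, step 1 fails, one auto-ref gives &&&A, Self = &&&A
    let foo = &&a;
    assert_eq!(foo.m(), "&&&A::m()");
    assert_eq!(foo.m(), M::m(&foo));
}
```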