
How does println! interact with multiple levels of indirection?

Tags: pointers, rust

I have the following program:

fn main() {
    let x = 0;

    println!("Example 1: {:p}", &x);
    println!("Example 1: {:p}", &x);

    println!("Example 2: {:p}", &&x);
    println!("Example 2: {:p}", &&x);
}

Here is an example output:

Example 1: 0x7ffcb4e72144
Example 1: 0x7ffcb4e72144
Example 2: 0x7ffcb4e72238
Example 2: 0x7ffcb4e72290

The outputs for "Example 1" are consistently the same, while those for "Example 2" are consistently different.

I've read "Does println! borrow or own the variable?", and what I understood from the given answer is that println! silently takes a reference. In other words, this sounds like println! adds an extra level of indirection.

I had expected the outputs for "Example 1" to be different as well. Seeing that println! silently takes another level of indirection, "Example 1" is actually working with &&x, and "Example 2" is working with &&&x. This seems to agree with the answer I linked, specifically: "If you write println!("{}", &x), you are then dealing with two levels of references".

I thought that the value &&x holds would be printed for "Example 1", while the value &&&x holds would be printed for "Example 2". Both &&x and &&&x hold a temporary &x, so I thought "Example 1" would have different addresses printed as well.

Where am I wrong? Why doesn't "Example 1" have different addresses printed?

Asked by Mario Ishac, Jun 14 '20 at 01:06


1 Answer

Let's start with a trick question: Does this compile or not?

fn main() {
    println!("{:p}", 1i32);
}

We're asking to print an i32 as a memory address. Does this make sense?

No, of course, and Rust rightfully rejects this program.

error[E0277]: the trait bound `i32: std::fmt::Pointer` is not satisfied
 --> src/main.rs:2:22
  |
2 |     println!("{:p}", 1i32);
  |                      ^^^^ the trait `std::fmt::Pointer` is not implemented for `i32`
  |
  = note: required by `std::fmt::Pointer::fmt`
  = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

But we know that the macro implicitly borrows its arguments, so 1i32 becomes &1i32. And references do implement Pointer. So what's the deal?

First, it helps to understand why the macro borrows its arguments. Have you ever noticed that all the formatting traits look virtually identical? They all define exactly one method, named fmt, that takes two parameters, &self and a &mut Formatter and returns Result<(), fmt::Error>.
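To make the shared shape concrete, here is a minimal sketch that implements two formatting traits for the same type. The type Celsius is hypothetical and exists only for illustration; the point is that both impls have the same fmt signature.

```rust
use std::fmt;

// Hypothetical type, used only for illustration.
struct Celsius(f64);

// Display: selected by `{}`.
impl fmt::Display for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} C", self.0)
    }
}

// Debug: selected by `{:?}`. Same method name, same parameters,
// same return type (fmt::Result is Result<(), fmt::Error>).
impl fmt::Debug for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Celsius({})", self.0)
    }
}

fn main() {
    let t = Celsius(21.5);
    println!("{}", t);
    println!("{:?}", t);
}
```

Pointer follows the same pattern; only the format specifier that selects it (:p) differs.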

It is the &self that is relevant here. In order to call fmt, we only need a reference to the value, because formatting a value doesn't need ownership of that value. Now, the implementation of formatting arguments is more complicated than this, but ultimately, for an argument x, the program would end up calling std::fmt::Pointer::fmt(&x, formatter) (for :p). However, for this call to compile successfully, the type of x must implement Pointer, not the type of &x. If x is 1i32, then the type of x is i32, and i32 doesn't implement Pointer.
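As a rough sketch of that call (not the macro's real expansion, which is more involved), the hypothetical wrapper ViaPointer below invokes Pointer::fmt by hand on a value of type &i32 and compares the result with what {:p} prints for the same expression.

```rust
use std::fmt::{self, Pointer};

// Hypothetical wrapper, for illustration only: its Display impl
// forwards to Pointer::fmt for the `&i32` it stores.
struct ViaPointer<'a>(&'a i32);

impl fmt::Display for ViaPointer<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `self.0` has type `&i32`, which implements Pointer.
        // Passing `&self.0` matches the `&self` parameter of `fmt`;
        // the extra borrow is not what gets printed.
        Pointer::fmt(&self.0, f)
    }
}

fn main() {
    let x = 0;
    let direct = format!("{:p}", &x);
    let manual = format!("{}", ViaPointer(&x));
    // Both strings are the address of `x`.
    assert_eq!(direct, manual);
    println!("{} == {}", direct, manual);
}
```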

The conclusion is that the :p format will end up printing the value of the pointer represented by the expression written textually in your program. The borrow taken on that expression is there so that the macro doesn't take ownership of the argument (which is still useful for :p, e.g. if you wanted to print a Box<T>).
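The Box<T> case can be sketched as follows. Because the macro only borrows its argument, the box is still usable after being formatted with :p.

```rust
fn main() {
    let b = Box::new(5);

    // `{:p}` prints the heap address the Box points to.
    let first = format!("{:p}", b);

    // `b` was only borrowed above, so it can be formatted again.
    let second = format!("{:p}", b);
    assert_eq!(first, second);

    // And it can still be used as a value afterwards.
    println!("{} holds {}", first, b);
}
```

If the macro took its arguments by value, the second format! call would fail to compile because b would already have been moved.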


Now we can proceed to explaining the behavior of your program. x is a local variable. Local variables usually[1] have a stable address[2]. In your Example 1 calls, the expression &x allows us to observe that address. Both occurrences of &x will give the same result because x hasn't moved between the calls. What's printed is the address of x (i.e. the address that holds the value 0).

However, the expression &&x is a bit curious. What does it mean exactly to take the address twice? The subexpression &x produces a temporary value, because the result is not assigned to a variable. Then, we ask the address of that temporary value. Rust is kind enough to let us do that, but that means we must store the temporary value somewhere in memory in order for it to have some address. Here, the temporary value is stored in some hidden local variable.

It turns out that in debug builds, the compiler creates a separate hidden variable for each of the &x subexpressions in the two occurrences of &&x. That's why we can observe two different memory addresses for the Example 2 lines. In release builds, however, the code is optimized so that only one hidden variable is created: at the point where we need the second one, we no longer need the first one, so its memory location can be reused. As a result, the two Example 2 lines actually print the same memory address!
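If you want the second level of indirection to have a stable address regardless of build mode, one option is to store the inner reference in a named variable instead of relying on a hidden temporary. A minimal sketch, where the variable name r is my own choice:

```rust
fn main() {
    let x = 0;

    // Give the inner reference a name, so it lives in one
    // ordinary local variable rather than a hidden temporary.
    let r = &x;

    // `&r` is the address of `r` itself, which does not change
    // between the two calls.
    let a = format!("{:p}", &r);
    let b = format!("{:p}", &r);
    assert_eq!(a, b);

    // `r` still points at `x`, so printing `r` gives the address of `x`.
    assert_eq!(format!("{:p}", r), format!("{:p}", &x));

    println!("address of r: {}, address of x: {:p}", a, r);
}
```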


[1] I say usually because there might be situations where an optimizer could decide to move a local variable around in memory. I don't know if any optimizer actually does that in practice.

[2] Some local variables might not have an "address" at all! An optimizer may decide to keep a local variable in a register instead if the address of that variable is never observed. On many processor architectures, registers cannot be addressed by a pointer, because they live in a different "address space", so to speak. Of course, here, we are observing the address, so we can be pretty confident that the variable actually lives on the stack.

Answered by Francis Gagné, Sep 20 '22 at 14:09