I believe that this function declaration tells Rust that the lifetime of the function's output is the same as the lifetime of it's s
parameter:
fn substr<'a>(s: &'a str, until: u32) -> &'a str; ^^^^
It seems to me that the compiler only needs to know this(1):
fn substr(s: &'a str, until: u32) -> &'a str;
What does the annotation <'a>
after the function name mean? Why does the compiler need it, and what does it do with it?
(1): I know it needs to know even less, due to lifetime elision. But this question is about specifying lifetime explicitly.
Lifetimes are what the Rust compiler uses to keep track of how long references are valid for. Checking references is one of the borrow checker's main responsibilities. Lifetimes help the borrow checker ensure that you never have invalid references.
A lifetime is a construct the compiler (or more specifically, its borrow checker) uses to ensure all borrows are valid. Specifically, a variable's lifetime begins when it is created and ends when it is destroyed. While lifetimes and scopes are often referred to together, they are not the same.
Rust uses lifetime parameters to avoid such potential run-time errors. Since the compiler doesn't know in advance whether the if or the else block will execute, the code above won't compile and an error message will be printed that says, “a lifetime parameter is expected in compare 's signature.”
Let me expand on the previous answers…
What does the annotation <'a> after the function name mean?
I wouldn't use the word "annotation" for that. Much like <T>
introduces a generic type parameter, <'a>
introduces a generic lifetime parameter. You can't use any generic parameters without introducing them first and for generic functions this introduction happens right after their name. You can think of a generic function as a family of functions. So, essentially, you get one function for every combination of generic parameters. substr::<'x>
would be a specific member of that function family for some lifetime 'x
.
If you're unclear on when and why we have to be explicit about lifetimes, read on…
A lifetime parameter is always associated with all reference types. When you write
fn main() { let x = 28374; let r = &x; }
the compiler knows that x lives in the main function's scope enclosed with curly braces. Internally, it identifies this scope with some lifetime parameter. For us, it is unnamed. When you take the address of x
, you'll get a value of a specific reference type. A reference type is kind of a member of a two dimensional family of reference types. One axis is the type of what the reference points to and the other axis is a lifetime that is used for two constraints:
Together, these constraints play a vital role in Rust's memory safety story. The goal here is to avoid dangling references. We would like to rule out references that point to some memory region we are not allowed to use anymore because that thing it used to point to does not exist anymore.
One potential source of confusion is probably the fact that lifetime parameters are invisible most of the time. But that does not mean they are not there. References always have a lifetime parameter in their type. But such a lifetime parameter does not have to have a name and most of the time we don't need to mention it anyways because the compiler can assign names for lifetime parameters automatically. This is called "lifetime elision". For example, in the following case, you don't see any lifetime parameters being mentioned:
fn substr(s: &str, until: u32) -> &str {…}
But it's okay to write it like this. It's actually a short-cut syntax for the more explicit
fn substr<'a>(s: &'a str, until: u32) -> &'a str {…}
Here, the compiler automatically assigns the same name to the "input lifetime" and the "output lifetime" because it's a very common pattern and most likely exactly what you want. Because this pattern is so common, the compiler lets us get away without saying anything about lifetimes. It assumes that this more explicit form is what we meant based on a couple of "lifetime elision" rules (which are at least documented here)
There are situations in which explicit lifetime parameters are not optional. For example, if you write
fn min<T: Ord>(x: &T, y: &T) -> &T { if x <= y { x } else { y } }
the compiler will complain because it will interpret the above declaration as
fn min<'a, 'b, 'c, T: Ord>(x: &'a T, y: &'b T) -> &'c T { … }
So, for each reference a separate lifetime parameter is introduced. But no information on how the lifetime parameters relate to each other is available in this signature. The user of this generic function could use any lifetimes. And that's a problem inside its body. We're trying to return either x
or y
. But the type of x
is &'a T
. That's not compatible with the return type &'c T
. The same is true for y
. Since the compiler knows nothing about how these lifetimes relate to each other, it's not safe to return these references as a reference of type &'c T
.
Can it ever be safe to go from a value of type &'a T
to &'c T
? Yes. It's safe if the lifetime 'a
is equal or greater than the lifetime 'c
. Or in other words 'a: 'c
. So, we could write this
fn min<'a, 'b, 'c, T: Ord>(x: &'a T, y: &'b T) -> &'c T where 'a: 'c, 'b: 'c { … }
and get away with it without the compiler complaining about the function's body. But it's actually unnecessarily complex. We can also simply write
fn min<'a, T: Ord>(x: &'a T, y: &'a T) -> &'a T { … }
and use a single lifetime parameter for everything. The compiler is able to deduce 'a
as the minimum lifetime of the argument references at the call site just because we used the same lifetime name for both parameters. And this lifetime is precisely what we need for the return type.
I hope this answers your question. :) Cheers!
What does the annotation <'a> after the function name mean?
fn substr<'a>(s: &'a str, until: u32) -> &'a str; // ^^^^
This is declaring a generic lifetime parameter. It's similar to a generic type parameter (often seen as <T>
), in that the caller of the function gets to decide what the lifetime is. Like you said, the lifetime of the result will be the same as the lifetime of the first argument.
All lifetime names are equivalent, except for one: 'static
. This lifetime is pre-set to mean "guaranteed to live for the entire life of the program".
The most common lifetime parameter name is probably 'a
, but you can use any letter or string. Single letters are most common, but any snake_case
identifier is acceptable.
Why does the compiler need it, and what does it do with it?
Rust generally favors things to be explicit, unless there's a very good ergonomic benefit. For lifetimes, lifetime elision takes care of something like 85+% of cases, which seemed like a clear win.
Type parameters live in the same namespace as other types — is T
a generic type or did someone name a struct that? Thus type parameters need to have an explicit annotation that shows that T
is a parameter and not a real type. However, lifetime parameters don't have this same problem, so that's not the reason.
Instead, the main benefit of explicitly listing type parameters is because you can control how multiple parameters interact. A nonsense example:
fn better_str<'a, 'b, 'c>(a: &'a str, b: &'b str) -> &'c str where 'a: 'c, 'b: 'c, { if a.len() < b.len() { a } else { b } }
We have two strings and say that the input strings may have different lifetimes, but must both outlive the lifetime of the result value.
Another example, as pointed out by DK, is that structs can have their own lifetimes. I made this example also a bit of nonsense, but it hopefully conveys the point:
struct Player<'a> { name: &'a str, } fn name<'p, 'n>(player: &'p Player<'n>) -> &'n str { player.name }
Lifetimes can be one of the more mind-bending parts of Rust, but they are pretty great when you start to grasp them.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With