Why does the lifetime name appear as part of the function type?

I believe that this function declaration tells Rust that the lifetime of the function's output is the same as the lifetime of its s parameter:

fn substr<'a>(s: &'a str, until: u32) -> &'a str;
         ^^^^

It seems to me that the compiler only needs to know this(1):

fn substr(s: &'a str, until: u32) -> &'a str;

What does the annotation <'a> after the function name mean? Why does the compiler need it, and what does it do with it?

(1): I know it needs to know even less, due to lifetime elision. But this question is about specifying lifetime explicitly.

asked Apr 24 '15 01:04

Wayne Conrad

2 Answers

Let me expand on the previous answers…

What does the annotation <'a> after the function name mean?

I wouldn't use the word "annotation" for that. Much like <T> introduces a generic type parameter, <'a> introduces a generic lifetime parameter. You can't use a generic parameter without introducing it first, and for generic functions this introduction happens right after the function name. You can think of a generic function as a family of functions: essentially, you get one function for every combination of generic parameters. substr::<'x> would be a specific member of that family for some lifetime 'x.
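To make the parallel concrete, here is a minimal sketch that puts a type parameter and a lifetime parameter side by side. The function names first_item and whole are made up for this illustration and are not part of the question.

```rust
// `<T>` introduces a type parameter before it is used in the signature.
// (`first_item` is a hypothetical name used only for this sketch.)
fn first_item<T>(items: &[T]) -> Option<&T> {
    items.first()
}

// `<'a>` introduces a lifetime parameter in exactly the same position.
// (`whole` is likewise a hypothetical name.)
fn whole<'a>(s: &'a str) -> &'a str {
    s
}

// Without the introduction, the compiler does not know the name:
// fn broken(s: &'a str) -> &'a str { s }
// error[E0261]: use of undeclared lifetime name `'a`
```

This is also why the second signature in the question is rejected as written: 'a is used there without ever being introduced.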

If you're unclear on when and why we have to be explicit about lifetimes, read on…

Every reference type has a lifetime parameter associated with it. When you write

fn main() {
    let x = 28374;
    let r = &x;
}

the compiler knows that x lives in the scope of the main function, delimited by its curly braces. Internally, it identifies this scope with some lifetime. For us, it is unnamed. When you take the address of x, you get a value of a specific reference type. A reference type is a member of a two-dimensional family of reference types. One axis is the type of what the reference points to, and the other axis is a lifetime that is used for two constraints:

1. The lifetime parameter of a reference type represents an upper bound for how long you can hold on to that reference.
2. The lifetime parameter of a reference type represents a lower bound for the lifetime of the things you can make the reference point to.

Together, these constraints play a vital role in Rust's memory safety story. The goal here is to avoid dangling references. We would like to rule out references that point to a memory region we are no longer allowed to use because the thing they pointed to no longer exists.
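As a rough sketch of what these two constraints rule out, consider the following. The function name read_through_reference is hypothetical, and the rejected case is left in comments so that the block still compiles.

```rust
// Hypothetical helper for this sketch: reads a value through a reference
// while the referent is still alive.
fn read_through_reference() -> i32 {
    let x = 28374;
    let r: &i32 = &x; // the lifetime in the type of `r` cannot exceed the scope of `x`
    let copy = *r; // fine: `x` is still alive here

    // A reference to a shorter-lived value cannot be kept in a longer-lived variable:
    // let dangling: &i32;
    // {
    //     let y = 1;
    //     dangling = &y;
    // }
    // println!("{}", dangling);
    // error[E0597]: `y` does not live long enough

    copy
}
```

Calling read_through_reference() from main returns the value that was read through r.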

One potential source of confusion is that lifetime parameters are invisible most of the time. But that does not mean they are not there. References always have a lifetime parameter in their type. Such a lifetime parameter does not need a name, though, and most of the time we don't need to mention it anyway because the compiler can assign names for lifetime parameters automatically. This is called "lifetime elision". For example, in the following case, you don't see any lifetime parameters being mentioned:

fn substr(s: &str, until: u32) -> &str {…}

But it's okay to write it like this. It's actually a short-cut syntax for the more explicit

fn substr<'a>(s: &'a str, until: u32) -> &'a str {…}

Here, the compiler automatically assigns the same name to the "input lifetime" and the "output lifetime" because it's a very common pattern and most likely exactly what you want. Because this pattern is so common, the compiler lets us get away without saying anything about lifetimes. It assumes that this more explicit form is what we meant, based on a couple of "lifetime elision" rules (which are documented in the "Lifetime elision" chapter of the Rust Reference).
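A small sketch of the two spellings side by side. The question gives no body for substr, so the body below is an assumption, and the names substr_elided and substr_explicit are made up so that both versions can live in one file.

```rust
// Elided form: the compiler fills in the lifetimes.
// The body is an assumption for this sketch; the question only shows the signature.
fn substr_elided(s: &str, until: u32) -> &str {
    // `get` returns None if `until` is out of range or not on a char boundary;
    // falling back to the whole string is an arbitrary choice made here.
    s.get(..until as usize).unwrap_or(s)
}

// Explicit form: the same signature with the lifetime spelled out.
fn substr_explicit<'a>(s: &'a str, until: u32) -> &'a str {
    s.get(..until as usize).unwrap_or(s)
}
```

Both functions accept the same arguments and return the same slice, which is what you would expect if the first form is just shorthand for the second.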

There are situations in which explicit lifetime parameters are not optional. For example, if you write

fn min<T: Ord>(x: &T, y: &T) -> &T {
    if x <= y {
        x
    } else {
        y
    }
}

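the compiler will complain. The elision rules give each input reference its own lifetime, but with two candidates they cannot pick one for the output, so you get a "missing lifetime specifier" error. The most general explicit signature you could write instead is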

fn min<'a, 'b, 'c, T: Ord>(x: &'a T, y: &'b T) -> &'c T { … }

So, for each reference a separate lifetime parameter is introduced. But no information on how the lifetime parameters relate to each other is available in this signature. The user of this generic function could use any lifetimes. And that's a problem inside its body. We're trying to return either x or y. But the type of x is &'a T. That's not compatible with the return type &'c T. The same is true for y. Since the compiler knows nothing about how these lifetimes relate to each other, it's not safe to return these references as a reference of type &'c T.

Can it ever be safe to go from a value of type &'a T to &'c T? Yes. It's safe if the lifetime 'a is equal to or greater than the lifetime 'c. In other words, 'a: 'c (read "'a outlives 'c"). So, we could write this

fn min<'a, 'b, 'c, T: Ord>(x: &'a T, y: &'b T) -> &'c T
      where 'a: 'c, 'b: 'c { … }

and get away with it without the compiler complaining about the function's body. But it's actually unnecessarily complex. We can also simply write

fn min<'a, T: Ord>(x: &'a T, y: &'a T) -> &'a T { … }

and use a single lifetime parameter for everything. The compiler is able to deduce 'a as the minimum lifetime of the argument references at the call site just because we used the same lifetime name for both parameters. And this lifetime is precisely what we need for the return type.
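Both signatures discussed above can be tried out with the sketch below. The names min_ref, min_ref_verbose and smaller_of_two are hypothetical, chosen for this illustration.

```rust
// Single lifetime parameter for everything.
fn min_ref<'a, T: Ord>(x: &'a T, y: &'a T) -> &'a T {
    if x <= y { x } else { y }
}

// Three lifetime parameters with explicit "outlives" bounds.
fn min_ref_verbose<'a, 'b, 'c, T: Ord>(x: &'a T, y: &'b T) -> &'c T
where
    'a: 'c,
    'b: 'c,
{
    if x <= y { x } else { y }
}

// The two arguments live for different lengths of time; the compiler picks
// a lifetime for 'a that both borrows satisfy (here, the inner block).
fn smaller_of_two() -> i32 {
    let outer = 10;
    let result;
    {
        let inner = 3;
        // The returned reference is only valid inside this block,
        // so the value is copied out before the block ends.
        result = *min_ref(&outer, &inner);
    }
    result
}
```

Swapping min_ref for min_ref_verbose inside smaller_of_two should compile just the same, which illustrates why the extra parameters buy nothing here.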

I hope this answers your question. :) Cheers!

answered Dec 06 '22 12:12

sellibitze

What does the annotation <'a> after the function name mean?

fn substr<'a>(s: &'a str, until: u32) -> &'a str;
//       ^^^^

This is declaring a generic lifetime parameter. It's similar to a generic type parameter (often seen as <T>), in that the caller of the function gets to decide what the lifetime is. Like you said, the lifetime of the result will be the same as the lifetime of the first argument.

All lifetime names are equivalent, except for one: 'static. This lifetime is pre-set to mean "guaranteed to live for the entire life of the program".

The most common lifetime parameter name is probably 'a, but you can use any valid identifier. Single letters are most common, but any snake_case identifier is acceptable.
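A short sketch of both points. The function names greeting and trimmed and the lifetime name 'input are made up for this illustration.

```rust
// A string literal is stored in the program binary,
// so a reference to it can have the `'static` lifetime.
fn greeting() -> &'static str {
    "hello"
}

// Any identifier works as a lifetime name; `'input` behaves exactly like `'a` would.
fn trimmed<'input>(s: &'input str) -> &'input str {
    s.trim()
}
```

A longer name like 'input can be handy when a signature has several lifetimes and you want to remember which one is which.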

Why does the compiler need it, and what does it do with it?

Rust generally favors things to be explicit, unless there's a very good ergonomic benefit. For lifetimes, lifetime elision takes care of something like 85+% of cases, which seemed like a clear win.

Type parameters live in the same namespace as other types — is T a generic type or did someone name a struct that? Thus type parameters need to have an explicit annotation that shows that T is a parameter and not a real type. However, lifetime parameters don't have this same problem, so that's not the reason.

Instead, the main benefit of explicitly listing lifetime parameters is that you can control how multiple parameters interact. A nonsense example:

fn better_str<'a, 'b, 'c>(a: &'a str, b: &'b str) -> &'c str
where
    'a: 'c,
    'b: 'c,
{
    if a.len() < b.len() {
        a
    } else {
        b
    }
}

We have two strings and say that the input strings may have different lifetimes, but must both outlive the lifetime of the result value.
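Here is a sketch of a call site where the two inputs really do have different lifetimes. better_str is repeated so the block is self-contained; shorter_len is a hypothetical caller added for illustration.

```rust
fn better_str<'a, 'b, 'c>(a: &'a str, b: &'b str) -> &'c str
where
    'a: 'c,
    'b: 'c,
{
    if a.len() < b.len() { a } else { b }
}

// Hypothetical caller: the two strings are dropped at different points.
fn shorter_len() -> usize {
    let long_lived = String::from("lifetime");
    let len;
    {
        let short_lived = String::from("rust");
        // Both inputs must outlive the result, so the result
        // cannot be used after `short_lived` is dropped.
        let shorter = better_str(&long_lived, &short_lived);
        len = shorter.len();
    }
    len
}
```

Only the length escapes the inner block; trying to return shorter itself from that block would be rejected, because one of its possible sources does not live long enough.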

Another example, as pointed out by DK, is that structs can have their own lifetimes. I made this example also a bit of nonsense, but it hopefully conveys the point:

struct Player<'a> {
    name: &'a str,
}

fn name<'p, 'n>(player: &'p Player<'n>) -> &'n str {
    player.name
}
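To see why the two lifetimes are worth separating, here is a sketch in which the returned name outlives the Player it was read from. Player and name are repeated from above; name_outlives_player is a hypothetical caller.

```rust
struct Player<'a> {
    name: &'a str,
}

fn name<'p, 'n>(player: &'p Player<'n>) -> &'n str {
    player.name
}

// Hypothetical caller for this sketch.
fn name_outlives_player() -> usize {
    let stored = String::from("Ferris");
    let extracted;
    {
        let player = Player { name: &stored };
        // The result is tied to 'n (the borrowed string),
        // not to 'p (the short borrow of `player`).
        extracted = name(&player);
    } // `player` is gone here, but `extracted` still points into `stored`
    extracted.len()
}
```

If the signature had tied the return value to 'p instead, this caller would not compile, since player does not live past the inner block.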

Lifetimes can be one of the more mind-bending parts of Rust, but they are pretty great when you start to grasp them.

answered Dec 06 '22 12:12

Shepmaster
