I am working my way through the Rust book, and it has this code snippet:
    fn first_word(s: &String) -> usize {
        let bytes = s.as_bytes();
        for (i, &item) in bytes.iter().enumerate() {
            if item == b' ' {
                return i;
            }
        }
        s.len()
    }
In the previous chapter, it was explained that
To me, it looks like bytes is not referenced after the for loop header (presumably the code associated with bytes.iter().enumerate() is executed just once before the loop starts, not on every loop iteration), so &item shouldn't be allowed to be a reference into any part of bytes. But I don't see any other object (is "object" the right Rust terminology?) that it could be a reference into.
It's true that s is still in scope, but, well... does the compiler even remember the connection between bytes and s by the time the for loop rolls around? Indeed, even if I change the function to accept bytes directly, the compiler thinks things are hunky-dory:
    fn first_word(bytes: &[u8]) -> usize {
        for (i, &item) in bytes.iter().enumerate() {
            if item == b' ' {
                return i+1;
            }
        }
        0
    }
Here no variables other than i or item are mentioned after the for loop header, so it really seems like &item can't be a reference into anything!
I looked at this very similarly-titled question. The comments and answers there suggest that there might be a lifetime argument which is keeping bytes alive, and propose a way to ask the compiler what it thinks the type/lifetime is. I haven't learned about lifetimes yet, so I'm fumbling about in the dark a little, but I tried:
    fn first_word<'x_lifetime>(s: &String) -> usize {
        let bytes = s.as_bytes();
        let x_variable: &'x_lifetime () = bytes;
        for (i, &item) in bytes.iter().enumerate() {
            if item == b' ' {
                return i+1;
            }
        }
        0
    }
However, the error, while indeed indicating that bytes is a &[u8], which I understand, also seems to imply there's no extra lifetime information associated with bytes. I say this because the error includes lifetime information in the "expected" part but not in the "found" part. Here it is:
    error[E0308]: mismatched types
     --> test.rs:3:39
      |
    3 |     let x_variable: &'x_lifetime () = bytes;
      |                     ---------------   ^^^^^ expected `()`, found slice `[u8]`
      |                     |
      |                     expected due to this
      |
      = note: expected reference `&'x_lifetime ()`
                 found reference `&[u8]`
So what is going on here? Obviously some part of my reasoning is off, but what? Why isn't item a dangling reference?
Here is a version of the first function with all the lifetimes and types included, and the for loop replaced with an equivalent while let loop.
    fn first_word<'a>(s: &'a String) -> usize {
        let bytes: &'a [u8] = s.as_bytes();
        let mut iter: std::iter::Enumerate<std::slice::Iter<'a, u8>>
            = bytes.iter().enumerate();
        while let Some(iter_item) = iter.next() {
            let (i, &item): (usize, &'a u8) = iter_item;
            if item == b' ' {
                return i;
            }
        }
        s.len()
    }
Things to notice here:
- The type of bytes.iter().enumerate() has a lifetime parameter which ensures the iterator does not outlive the [u8] it iterates over. (Note that the &[u8] can go away; it isn't needed. What matters is that its referent, the bytes inside the String, stays alive. So we only really need to think about one lifetime 'a, not separate lifetimes for s and bytes, because there's only one byte slice inside the String that we're referring to in different ways.)
- iter, an explicit variable corresponding to the implicit action of for, is used in every iteration of the loop.
I see another misunderstanding, not exactly about lifetimes and scope:
> there might be a lifetime argument which is keeping bytes alive,
Lifetimes never keep something alive. Lifetimes never affect the execution of the program; they never affect when something is dropped or deallocated. A lifetime in some type such as &'a u8 is a compile-time claim that values of that type will be valid for that lifetime. Changing the lifetimes in a program changes only what is to be proven (checked) by the compiler, not what is true about the program.
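To make the point about the referent concrete, here is a minimal sketch. The first_space helper is hypothetical (it is not from the question or the Rust book); it returns a reference into the String's bytes, and that reference remains usable after the local bytes variable inside the function is gone. The loop at the end illustrates a related detail: in the pattern &item, the & destructures the reference, so item is a copied u8 rather than a reference.

```rust
// Hypothetical helper, not from the question or the Rust book.
// It takes `&String` only to mirror the question; `&str` would be more usual.
fn first_space<'a>(s: &'a String) -> Option<&'a u8> {
    // `bytes` is a local `&'a [u8]`. The variable disappears when the
    // function returns, but the bytes it points at belong to the String.
    let bytes: &'a [u8] = s.as_bytes();
    for item_ref in bytes.iter() {
        // `item_ref` is a `&'a u8` pointing into the String's buffer.
        if *item_ref == b' ' {
            return Some(item_ref);
        }
    }
    None
}

fn main() {
    let s = String::from("hello world");

    // The returned reference is still valid here, after the local `bytes`
    // inside `first_space` is gone, because `s` itself is still alive.
    let space: &u8 = first_space(&s).expect("the string contains a space");
    assert_eq!(*space, b' ');
    // It points at the very same byte that lives inside `s`.
    assert!(std::ptr::eq(space, &s.as_bytes()[5]));

    // In the pattern `&item`, the `&` destructures the reference, so `item`
    // is a plain `u8` copied out of the slice, not a reference.
    for (_i, &item) in s.as_bytes().iter().enumerate() {
        let _copied: u8 = item;
    }
}
```

If s were dropped while space was still in use, the compiler would reject the program. The 'a annotation would not extend the life of s; it only states what has to be checked.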