I'm getting started with Rust, and I'm playing around with the regex crate so that I can create a lexer.
The lexer uses a big regular expression that contains a bunch of named capture groups. I'm trying to take the results of my regular expression and create a Vec<(&str, &str)>
of the capture name and capture value, but I keep running into issues with the lifetimes of the values returned while mapping and filtering over the results.
I think this has something to do with laziness and the fact that the iterator has not been consumed when it falls out of scope, but I'm not sure how to actually solve the problem.
    extern crate regex;

    use regex::Regex;

    fn main() {
        // Define a regular expression with a bunch of named capture groups
        let expr = "((?P<num>[0-9]+)|(?P<str>[a-zA-Z]+))";
        let text = "0ab123cd";
        let re = Regex::new(&expr).unwrap();

        let tokens: Vec<(&str, &str)> = re.captures_iter(text)
            .flat_map(|t| t.iter_named())
            .filter(|t| t.1.is_some())
            .map(|t| (t.0, t.1.unwrap()))
            .collect();

        for token in tokens {
            println!("{:?}", token);
        }
    }
Running the above code results in the following error:
$ cargo run
Compiling hello_world v0.0.1 (file:///Users/dowling/projects/rust_hello_world)
src/main.rs:14:23: 14:24 error: `t` does not live long enough
src/main.rs:14 .flat_map(|t| t.iter_named())
^
src/main.rs:17:19: 22:2 note: reference must be valid for the block suffix following statement 3 at 17:18...
src/main.rs:17 .collect();
src/main.rs:18
src/main.rs:19 for token in tokens {
src/main.rs:20 println!("{:?}", token);
src/main.rs:21 }
src/main.rs:22 }
src/main.rs:14:23: 14:37 note: ...but borrowed value is only valid for the block at 14:22
src/main.rs:14 .flat_map(|t| t.iter_named())
^~~~~~~~~~~~~~
error: aborting due to previous error
Could not compile `hello_world`.
The limiting point in your situation is the .iter_named() method:
fn iter_named(&'t self) -> SubCapturesNamed<'t>
Note the &'t self: the lifetime of the output is tied to the lifetime of the Captures instance. This is because the names are stored in the Captures object, so any &str to them cannot outlive this object.
There is only one fix for that: you must keep the Captures instances alive:
    let captures = re.captures_iter(text).collect::<Vec<_>>();
    let tokens: Vec<(&str, &str)> = captures.iter()
        .flat_map(|t| t.iter_named())
        .filter(|t| t.1.is_some())
        .map(|t| (t.0, t.1.unwrap()))
        .collect();
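To see the same pattern without depending on a particular version of the regex crate, here is a minimal std-only sketch. Everything in it is a hypothetical stand-in and not part of regex: `Caps` imitates an object that owns its group names, `named_pairs` imitates the shape of `iter_named`, `fake_captures` imitates `captures_iter`, and `tokens` is an illustrative helper. It also swaps `filter` plus `unwrap` for `filter_map`, which is purely a style choice.

```rust
// Hypothetical stand-in for regex's `Captures`: it owns the group names
// and borrows the matched pieces from the searched text.
struct Caps<'t> {
    groups: Vec<(String, Option<&'t str>)>,
}

impl<'t> Caps<'t> {
    // Same shape as `iter_named`: each name borrows from `self` ('c),
    // while each value borrows from the searched text ('t).
    fn named_pairs<'c>(&'c self) -> Vec<(&'c str, Option<&'t str>)> {
        self.groups.iter().map(|(n, v)| (n.as_str(), *v)).collect()
    }
}

// Hypothetical replacement for `captures_iter`: splits the text into runs
// of ASCII digits ("num") and runs of everything else ("str").
fn fake_captures(text: &str) -> Vec<Caps<'_>> {
    let bytes = text.as_bytes();
    let mut out = Vec::new();
    let mut start = 0;
    while start < bytes.len() {
        let is_digit = bytes[start].is_ascii_digit();
        let mut end = start + 1;
        while end < bytes.len() && bytes[end].is_ascii_digit() == is_digit {
            end += 1;
        }
        let piece = &text[start..end];
        out.push(Caps {
            groups: vec![
                ("num".to_string(), if is_digit { Some(piece) } else { None }),
                ("str".to_string(), if is_digit { None } else { Some(piece) }),
            ],
        });
        start = end;
    }
    out
}

// The names in the result borrow from `captures`, so the caller has to
// keep that collection alive for as long as the tokens are in use.
fn tokens<'c, 't>(captures: &'c [Caps<'t>]) -> Vec<(&'c str, &'t str)> {
    captures
        .iter()
        .flat_map(|c| c.named_pairs())
        .filter_map(|(name, value)| value.map(|v| (name, v)))
        .collect()
}

fn main() {
    let text = "0ab123cd";
    // Collect the owners first; the tokens below borrow from them.
    let captures = fake_captures(text);
    for token in tokens(&captures) {
        println!("{:?}", token);
    }
}
```

If `captures` were a temporary created inside the iterator chain, as in the question, the names would borrow from a value that is dropped at the end of each closure call, which is the situation the compiler rejects.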