
Iterators, laziness, and ownership

Tags:

rust

I'm getting started with Rust, and I'm playing around with the regex crate so that I can create a lexer.

The lexer uses a big regular expression that contains a bunch of named capture groups. I'm trying to take the results of my regular expression and create a Vec<(&str, &str)> of the capture name and capture value, but I keep running into issues with the lifetimes of the values returned while mapping and filtering over the results.

I think this has something to do with laziness and the fact that the iterator has not been consumed when falling out of scope, but I'm not sure how to actually solve the problem.

extern crate regex;

use regex::Regex;

fn main() {
    // Define a regular expression with a bunch of named capture groups
    let expr = "((?P<num>[0-9]+)|(?P<str>[a-zA-Z]+))";
    let text = "0ab123cd";
    let re = Regex::new(&expr).unwrap();

    let tokens: Vec<(&str, &str)> = re.captures_iter(text)
        .flat_map(|t| t.iter_named())
        .filter(|t| t.1.is_some())
        .map(|t| (t.0, t.1.unwrap()))
        .collect();

    for token in tokens {
        println!("{:?}", token);
    }
}

Running the above code results in the following error:

$ cargo run
Compiling hello_world v0.0.1 (file:///Users/dowling/projects/rust_hello_world)

src/main.rs:14:23: 14:24 error: `t` does not live long enough
src/main.rs:14         .flat_map(|t| t.iter_named())
                                     ^
src/main.rs:17:19: 22:2 note: reference must be valid for the block suffix following statement 3 at 17:18...
src/main.rs:17         .collect();
src/main.rs:18 
src/main.rs:19     for token in tokens {
src/main.rs:20         println!("{:?}", token);
src/main.rs:21     }
src/main.rs:22 }
src/main.rs:14:23: 14:37 note: ...but borrowed value is only valid for the block at 14:22
src/main.rs:14         .flat_map(|t| t.iter_named())
                                     ^~~~~~~~~~~~~~
error: aborting due to previous error
Could not compile `hello_world`.
Michael Dowling asked Apr 06 '15 07:04


1 Answer

The limiting point in your situation is the .iter_named() method:

fn iter_named(&'t self) -> SubCapturesNamed<'t>

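Note the &'t self: the lifetime of the output is tied to the lifetime of the Captures instance. This is because the names are stored in the Captures object, so any &str pointing to them cannot outlive it. In your code, each Captures value t is moved into the flat_map closure and dropped when the closure returns, while the iterator returned by t.iter_named() would still be borrowing from it.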

There is only one fix for that: you must keep the Captures instances alive, for example by collecting them into a Vec before iterating over them:

let captures = re.captures_iter(text).collect::<Vec<_>>();
let tokens: Vec<(&str, &str)> = captures.iter()
    .flat_map(|t| t.iter_named())
    .filter(|t| t.1.is_some())
    .map(|t| (t.0, t.1.unwrap()))
    .collect();
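
If you want to see the same borrow pattern without pulling in the regex crate, here is a minimal std-only sketch. Everything in it is a hypothetical stand-in: Caps plays the role of Captures, the hand-written captures_iter replaces the regex match (it simply splits the input into digit and non-digit runs), and tokens is just a helper name I made up. It is meant to illustrate why collecting first works, not to reproduce the crate's actual implementation.

```rust
// Hypothetical stand-in for regex's `Captures`: it owns the group names
// (and, in this sketch only, the matched text as well).
struct Caps {
    groups: Vec<(String, Option<String>)>,
}

impl Caps {
    // Same shape as `iter_named`: every item borrows from `self`,
    // so the items cannot outlive the `Caps` value.
    fn iter_named<'c>(&'c self) -> impl Iterator<Item = (&'c str, Option<&'c str>)> + 'c {
        self.groups.iter().map(|(n, v)| (n.as_str(), v.as_deref()))
    }
}

// Hypothetical replacement for `re.captures_iter(text)`: yields one owned
// `Caps` per run of digits ("num") or non-digits ("str").
fn captures_iter(text: &str) -> std::vec::IntoIter<Caps> {
    let mut out = Vec::new();
    let mut chars = text.chars().peekable();
    while let Some(&c) = chars.peek() {
        let is_num = c.is_ascii_digit();
        let mut run = String::new();
        while let Some(&d) = chars.peek() {
            if d.is_ascii_digit() != is_num {
                break;
            }
            run.push(d);
            chars.next();
        }
        let (num, word) = if is_num { (Some(run), None) } else { (None, Some(run)) };
        out.push(Caps {
            groups: vec![("num".to_string(), num), ("str".to_string(), word)],
        });
    }
    out.into_iter()
}

// The returned tuples borrow from `captures`, which the caller keeps alive.
fn tokens<'c>(captures: &'c [Caps]) -> Vec<(&'c str, &'c str)> {
    captures
        .iter()
        .flat_map(|c| c.iter_named())
        .filter_map(|(name, value)| value.map(|v| (name, v)))
        .collect()
}

fn main() {
    // Collect first, so every `Caps` outlives the borrowed tokens.
    let captures: Vec<Caps> = captures_iter("0ab123cd").collect();
    for token in tokens(&captures) {
        println!("{:?}", token);
    }
}
```

Writing captures_iter("0ab123cd").flat_map(|c| c.iter_named()) directly is rejected for the same reason as in the question: each Caps is moved into the closure and dropped when the closure returns, while the iterator it handed out would still be borrowing from it.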
Levans answered Nov 15 '22 09:11