I'm working on Project Euler 96 to teach myself Rust. I've written this code to read in the file and convert it into a vector of integers (Playground).
let file = File::open(&args[1]).expect("Sudoku file not found");
let reader = BufReader::new(file);
let x = reader
.lines()
.map(|x| x.unwrap())
.filter(|x| !x.starts_with("Grid"))
.flat_map(|s| s.chars().collect::<Vec<_>>()) // <-- collect here!
.map(|x| x.to_digit(10).unwrap())
.collect::<Vec<_>>();
This all works fine but I'm puzzled why I have to collect into a vector in my flat_map
(I'm assuming creating unneeded vectors which will be immediately destroyed is inefficient). If I don't collect, it doesn't compile:
error[E0515]: cannot return value referencing function parameter `s`
--> src/main.rs:13:23
|
13 | .flat_map(|s| s.chars())
| -^^^^^^^^
| |
| returns a value referencing data owned by the current function
| `s` is borrowed here
The example from the docs shows almost the same code, but a collect is not required:
let words = ["alpha", "beta", "gamma"];
// chars() returns an iterator
let merged: String = words.iter()
.flat_map(|s| s.chars())
.collect();
assert_eq!(merged, "alphabetagamma");
So why is it different in my code?
The iterator reader.lines().map(|x| x.unwrap())
iterates over String
items, i.e. by value. Consequently, in .flat_map(|s| ...)
, the variable s
has the type String
(i.e. owned, not borrowed). In other words: the string is a local variable now and lives in the function. It's a simple rule that you cannot return references to local variables (see this Q&A). But that's exactly what s.chars()
does, even if it's a bit hidden.
Taking a look at str::chars
:
pub fn chars(&self) -> Chars<'_>
One can see that the string is borrowed. The returned Chars
object contains a reference to the original string. That's why we cannot return s.chars()
from the closure.
So why is it different in my code?
In the documentation's example, the iterator words.iter()
actually iterates over items of the type &&'static str
. Calling s.chars()
will also return a Chars
object that borrows some string, but that string's lifetime is 'static
(lives forever), so there is no problem with returning Chars
from that closure.
It would be great if the standard library had an OwnedChars
iterator that consumes a String
, works like Chars
and drops the string once the iterator is dropped. In that case it's fine to call s.owned_chars()
because the returned object does not reference the local s
, but owns it. But: such an owned iterator does not exist in the standard library!
I'm assuming creating unneeded vectors which will be immediately destroyed is inefficient
Yes, that is true in a way. But you might have missed that the reader.lines()
iterator also creates temporary objects of type String
. Those are more or less immediately destroyed as well! So even without the collect
in the flat_map
you have a bunch of unnecessary allocations. Note that sometimes that's Ok. In this case, I guess that input parsing is very fast in comparison to the actual algorithm you have to implement. So ... just collect
? It's probably fine in this case.
If you want to have high performance input parsing, I think you will not be able to avoid a standard loop, in particular in order to avoid unnecessary String
allocations. (Playground)
let mut line = String::new();
let mut input = Vec::new();
loop {
line.clear(); // clear contents, but keep memory buffer
// TODO: handle IO error properly
let bytes_read = reader.read_line(&mut line).expect("IO error");
if bytes_read == 0 {
break;
}
if line.starts_with("Grid") {
continue;
}
// TODO: handle invalid input error
input.extend(line.trim().chars().map(|c| c.to_digit(10).unwrap()));
}
In addition to the other answer, note that an owned iterator is easy to write:
struct OwnedChars {
s: String,
i: usize,
}
impl Iterator for OwnedChars {
type Item = char;
fn next(&mut self) -> Option<Self::Item> {
let c = self.s[self.i..].chars().next()?;
self.i += c.len_utf8();
Some(c)
}
}
fn into_iter(string: String) -> OwnedChars {
OwnedChars {
s: string,
i: 0,
}
}
fn main() {
let owned_iter = into_iter("Zażółć gęślą jaźń".into());
for c in owned_iter {
println!("{}", c);
}
}
And then, you don't need to collect:
fn main() {
use std::io::prelude::*;
let file = std::fs::File::open(std::env::args().nth(1).unwrap()).expect("Sudoku file not found");
let reader = std::io::BufReader::new(file);
let x = reader
.lines()
.map(|x| x.unwrap())
.filter(|x| !x.starts_with("Grid"))
.flat_map(into_iter)
.map(|x| x.to_digit(10).unwrap())
.collect::<Vec<_>>();
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With