Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a better functional way to process a vector with error checking?

I'm learning Rust and would like to know how I can improve the code below.

I have a vector of tuples of form (u32, String). The u32 values represent line numbers and the Strings are the text on the corresponding lines. As long as all the String values can be successfully parsed as integers, I want to return an Ok<Vec<i32>> containing the just parsed String values, but if not I want to return an error of some form (just an Err<String> in the example below).

I'm trying to learn to avoid mutability and use functional styles where appropriate, and the above is straightforward to do functionally if that was all that was needed. Here's what I came up with in this case:

fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, String> {
    sv.iter()
        .map(|s| s.1.parse::<i32>()
                    .map_err(|_e| "*** Invalid data.".to_string()))
        .collect()
}

However, the small catch is that I want to print an error message for every invalid value (and not just the first one), and the error messages should contain both the line number and the string values in the offending tuple.

I've managed to do it with the following code:

fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, String> {
    sv.iter()
        .map(|s| (s.0, s.1.parse::<i32>()
                  .or_else(|e| {
                      eprintln!("ERROR: Invalid data value at line {}:  '{}'",
                                s.0, s.1);
                      Err(e)
                  })))
        .collect::<Vec<(u32, Result<i32, _>)>>() // Collect here to avoid short-circuit
        .iter()
        .map(|i| i.1
             .clone()
             .map_err(|_e| "*** Invalid data.".to_string()))
        .collect()
}

This works, but seems rather messy and cumbersome - especially the typed collect() in the middle to avoid short-circuiting so all the errors are printed. The clone() call is also annoying, and I'm not really sure why it's needed - the compiler says I'm moving out of borrowed content otherwise, but I'm not really sure what's being moved. Is there a way it can be done more cleanly? Or should I go back to a more procedural style? When I tried, I ended up with mutable variables and a flag to indicate success and failure, which seems less elegant:

fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, String> {
    let mut datavals = Vec::new();
    let mut success = true;
    for s in sv {
        match s.1.parse::<i32>() {
            Ok(v) => datavals.push(v),
            Err(_e) => {
                eprintln!("ERROR: Invalid data value at line {}:  '{}'",
                          s.0, s.1);
                success = false;
            },
        }
    }
    if success {
        return Ok(datavals);
    } else {
        return Err("*** Invalid data.".to_string());
    }
}

Can someone advise me on the best way to do this? Should I stick to the procedural style here, and if so can that be improved? Or is there a cleaner functional way to do it? Or a blend of the two? Any advice appreciated.

like image 853
Bob Avatar asked Jan 25 '19 22:01

Bob


2 Answers

I think that's what partition_map() from itertools is for:

use itertools::{Either, Itertools};

fn data_vals<'a>(sv: &[&'a str]) -> Result<Vec<i32>, Vec<(&'a str, std::num::ParseIntError)>> {
    let (successes, failures): (Vec<_>, Vec<_>) =
        sv.iter().partition_map(|s| match s.parse::<i32>() {
            Ok(v) => Either::Left(v),
            Err(e) => Either::Right((*s, e)),
        });
    if failures.len() != 0 {
        Err(failures)
    } else {
        Ok(successes)
    }
}

fn main() {
    let numbers = vec!["42", "aaaezrgggtht", "..4rez41eza", "55"];
    println!("{:#?}", data_vals(&numbers));
}
like image 67
Stargateur Avatar answered Oct 12 '22 11:10

Stargateur


In a purely functional style, you have to avoid side-effects. Printing errors is a side-effect. The preferred style would be to return an object of the style:

Result<Vec<i32>, Vec<String>>

and print the list after the data_vals function returns.

So, essentially, you want your processing to collect a list of integers, and a list of strings:

fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, Vec<String>> {
    let (ok, err): (Vec<_>, Vec<_>) = sv
        .iter()
        .map(|(i, s)| {
            s.parse()
                .map_err(|_e| format!("ERROR: Invalid data value at line {}: '{}'", i, s))
        })
        .partition(|e| e.is_ok());

    if err.len() > 0 {
        Err(err.iter().filter_map(|e| e.clone().err()).collect())
    } else {
        Ok(ok.iter().filter_map(|e| e.clone().ok()).collect())
    }
}

fn main() {
    let input = vec![(1, "0".to_string())];
    let r = data_vals(&input);
    assert_eq!(r, Ok(vec![0]));

    let input = vec![(1, "zzz".to_string())];
    let r = data_vals(&input);
    assert_eq!(r, Err(vec!["ERROR: Invalid data value at line 1: 'zzz'".to_string()]));
}

Playground Link

This uses partition which does not depend on an external crate.

like image 30
SirDarius Avatar answered Oct 12 '22 12:10

SirDarius