Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I make my Rust function more generic & efficient?

I have a function that works, but is more specialized than I'd like and has some inefficiencies, I'd like to fix that.

The working but flawed function:

fn iter_to_min<T>(i:T) -> i64 where T:Iterator<Item=String>{
    i.collect::<Vec<String>>()
        .iter()
        .flat_map(|s|s.split_whitespace())
        .map(str::trim)
        .map(str::parse::<i64>)
        .map(Result::unwrap)
        .min()
        .expect("No min found.")
}

The reasons I dislike this implementation are:

  • i64 is hard coded and I'd like to reuse this function for u64 and possibly other return types
  • it collects it's input just to iterate over it immediately which is inefficient (heap allocation for no reason)
  • the closure passed to flat_map might not be optimized away by LLVM in all cases

and the closest I could get to my ideal function:

use std::str::FromStr;

fn iter_to_min<T,U>(i:T) -> U where T:Iterator<Item=String>,U: Ord+FromStr{
    i.flat_map(str::split_whitespace)
        .map(str::trim)
        .map(str::parse::<U>)
        .map(Result::unwrap)
        .min()
        .expect("No min found.")
}

The problems I am seeing are:

  • the argument passed to str::split_whitespace is a String and won't coerce to a str
  • the argument passed to str::split_whitespace isn't known to live long enough
  • Result::unwrap doesn't complains that the trait core::fmt::Debug is not implemented for the type <U as core::str::FromStr>::Err

I think that with clever lifetime notation and Trait requirements at least two of those could be fixed, and who knows maybe there's a way to go three for three.

Example code using some suggested fixes:

use std::io::BufRead;
use std::str::FromStr;
use std::fmt::Debug;

fn iter_to_min<T,U>(i:T) -> U where T:Iterator<Item=String>,U: Ord+FromStr, U::Err: Debug{
    i.collect::<Vec<String>>()
        .iter()
        .flat_map(|s|s.split_whitespace())
        .map(str::trim)
        .map(str::parse::<U>)
        .map(Result::unwrap)
        .min()
        .expect("No min found.")
}

fn main() {
    let a: Vec<_> = std::env::args().skip(1).collect();
    let m:i64 = if a.is_empty() {
        let s = std::io::stdin();
        let m = iter_to_min(s.lock().lines().map(Result::unwrap));
        m
    }else{
        iter_to_min(a.into_iter())
    };
    println!("{}", m);
}
like image 244
Camden Narzt Avatar asked May 26 '16 08:05

Camden Narzt


People also ask

Does Rust support generic types?

Rust can make use of generics in several places: Function Definitions. Struct Definitions. Enum Definitions.

What is FN in Rust?

Keyword fnA function or function pointer. Functions are the primary way code is executed within Rust. Function blocks, usually just called functions, can be defined in a variety of different places and be assigned many different attributes and modifiers.

Are Rust generics templates?

Rust has a concept similar to templates called generics. A generics is a struct or trait that takes type parameters just like a template. However but the type can be enforced by saying the traits that it must implement. In addition any errors are meaningful.

What is generic data type?

Generics means parameterized types. The idea is to allow type (Integer, String, … etc., and user-defined types) to be a parameter to methods, classes, and interfaces. Using Generics, it is possible to create classes that work with different data types.


2 Answers

Unfortunately, there is no way to do what you want when keeping both the genericity and the absence of allocations. The reason is that you need someone to own your string data, but if flat_map(str::split_whitespace) is performed over an iterator of owned strings, then there is no one to keep these owned strings anymore because str::split_whitespace only borrows the string it is called on. However, if you "push" the ownership up the call chain, you won't be able to be fully generic and accept owned strings by value.

Therefore, there are two solutions: to precollect the entire iterator to a Vec<String> (or to transform the items yielded by split_whitespace() to owned strings separately and collect them to a Vec again), or to accept an iterator of references.

Here is the most generic version I could come up with of the first solution:

use std::str::FromStr;
use std::fmt::Debug;

fn iter_to_min<S, T, U>(i: T) -> U
    where S: Into<String>,
          T: IntoIterator<Item=S>,
          U: Ord + FromStr,
          U::Err: Debug
{
    i.into_iter()
        .map(Into::into)
        .collect::<Vec<_>>()
        .iter()
        .flat_map(|s| s.split_whitespace())
        .map(str::parse::<U>)
        .map(Result::unwrap)
        .min()
        .expect("No min found")
}

(try it here)

It is basically the same as your first one but more generic. Also note that you don't need to trim the parts of the string after split_whitespace() - the latter will make sure that the parts of the string do not have spaces at their sides. Into<String> bound allows one to pass both &str and String iterators, and in the latter case no extra copies will be done.

Alternatively, you can split each line into owned strings separately:

fn iter_to_min<S, T, U>(i: T) -> U
    where S: AsRef<str>,
          T: IntoIterator<Item=S>,
          U: Ord + FromStr,
          U::Err: Debug
{
    i.into_iter()
        .flat_map(|s| s.as_ref().split_whitespace().map(String::from).collect::<Vec<_>>())
        .map(|s| s.parse::<U>())
        .map(Result::unwrap)
        .min()
        .expect("No min found")
}

Here we only need to obtain a &strs from iterator items, not Strings, so I used AsRef<str>. However, each line must not only be converted to a sequence of Strings; this sequence must be collected into a vector for exactly the same reason described above - otherwise there would be no one to keep the original values of type S from destruction.

But it is possible to avoid the .map(String::from).collect::<Vec<_>>() if you are willing to lose some genericity, though. This is the second solution I've mentioned above. We can accept an iterator over references:

fn iter_to_min<'a, S: ?Sized, T, U>(i: T) -> U
    where S: AsRef<str> + 'a,
          T: IntoIterator<Item=&'a S>,
          U: Ord + FromStr,
          U::Err: Debug
{
    i.into_iter()
        .map(AsRef::as_ref)
        .flat_map(str::split_whitespace)
        .map(|s| s.parse::<U>())
        .map(Result::unwrap)
        .min()
        .expect("No min found")
}

(try it here)

Roughly speaking, now S values are owned by someone else, and their lifetime is bigger than the scope of iter_to_min(), so you need neither to transform each part to String nor to collect the entire split result to a Vec<String>. However, you won't be able to pass a Vec<String> to this function; you will be able to pass vec.iter(), however:

let v: Vec<String> = vec!["0".into(), "1".into()];
iter_to_min(v.iter())

In all these examples I've changed Iterator to IntoIterator - this is almost always what you want to use instead of just Iterator. It allows you, for example, to pass collections to such functions directly. Second, I've added U::Err: Debug condition, which is necessary for Result::unwrap to work. And finally, to fix the problem with "String not coercing to &str` you can always use explicit closures and method syntax which would do this coercion for you.

like image 186
Vladimir Matveev Avatar answered Sep 28 '22 09:09

Vladimir Matveev


A solution without extra allocations

use std::str::FromStr;

fn iter_to_min<T, U>(i: T) -> Option<U>
    where T: Iterator<Item = String>,
          U: Ord + FromStr
{
    i.filter_map(|string| {
            string.split_whitespace()
                .map(str::trim)
                .map(str::parse::<U>)
                .filter_map(Result::ok)
                .min()
        })
        .min()
}
like image 32
A.B. Avatar answered Sep 28 '22 09:09

A.B.