Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I group consecutive integers in a vector in Rust?

Tags:

rust

grouping

I have a Vec<i64> and I want to know all the groups of integers that are consecutive. As an example:

let v = vec![1, 2, 3, 5, 6, 7, 9, 10];

I'm expecting something like this or similar:

[[1, 2, 3], [5, 6, 7], [9, 10]];

The view (vector of vectors or maybe tuples or something else) really doesn't matter, but I should get several grouped lists with continuous numbers.

At the first look, it seems like I'll need to use itertools and the group_by function, but I have no idea how...

like image 491
Sergey Yakovlev Avatar asked May 16 '18 21:05

Sergey Yakovlev


3 Answers

You can indeed use group_by for this, but you might not really want to. Here's what I would probably write instead:

fn consecutive_slices(data: &[i64]) -> Vec<&[i64]> {
    let mut slice_start = 0;
    let mut result = Vec::new();
    for i in 1..data.len() {
        if data[i - 1] + 1 != data[i] {
            result.push(&data[slice_start..i]);
            slice_start = i;
        }
    }
    if data.len() > 0 {
        result.push(&data[slice_start..]);
    }
    result
}

This is similar in principle to eXodiquas' answer, but instead of accumulating a Vec<Vec<i64>>, I use the indices to accumulate a Vec of slice references that refer to the original data. (This question explains why I made consecutive_slices take &[T].)

It's also possible to do the same thing without allocating a Vec, by returning an iterator; however, I like the above version better. Here's the zero-allocation version I came up with:

fn consecutive_slices(data: &[i64]) -> impl Iterator<Item = &[i64]> {
    let mut slice_start = 0;
    (1..=data.len()).flat_map(move |i| {
        if i == data.len() || data[i - 1] + 1 != data[i] {
            let begin = slice_start;
            slice_start = i;
            Some(&data[begin..i])
        } else {
            None
        }
    })
}

It's not as readable as a for loop, but it doesn't need to allocate a Vec for the return value, so this version is more flexible.


Here's a "more functional" version using group_by:

use itertools::Itertools;

fn consecutive_slices(data: &[i64]) -> Vec<Vec<i64>> {
    (&(0..data.len()).group_by(|&i| data[i] as usize - i))
        .into_iter()
        .map(|(_, group)| group.map(|i| data[i]).collect())
        .collect()
}

The idea is to make a key function for group_by that takes the difference between each element and its index in the slice. Consecutive elements will have the same key because indices increase by 1 each time. One reason I don't like this version is that it's quite difficult to get slices of the original data structure; you almost have to create a Vec<Vec<i64>> (hence the two collects). The other reason is that I find it harder to read.

However, when I first wrote my preferred version (the first one, with the for loop), it had a bug (now fixed), while the other two versions were correct from the start. So there may be merit to writing denser code with functional abstractions, even if there is some hit to readability and/or performance.

like image 179
trent Avatar answered Oct 23 '22 14:10

trent


let v = vec![1, 2, 3, 5, 6, 7, 9, 10];
let mut res = Vec::new();
let mut prev = v[0];
let mut sub_v = Vec::new();

sub_v.push(prev);

for i in 1..v.len() {
    if v[i] == prev + 1 {
        sub_v.push(v[i]);
        prev = v[i];
    } else {
        res.push(sub_v.clone());
        sub_v.clear();
        sub_v.push(v[i]);
        prev = v[i];
    }
}

res.push(sub_v);

This should solve your problem.

Iterating over the given vector, checking if the current i64 (in my case i32) is +1 to the previous i64, if so push it into a vector (sub_v). After the series breaks, push the sub_v into the result vector. Repeat.

But I guess you wanted something functional?

like image 22
eXodiquas Avatar answered Oct 23 '22 15:10

eXodiquas


Another possible solution, that uses std only, could be:

fn consecutive_slices(v: &[i64]) -> Vec<Vec<i64>> {
    let t: Vec<Vec<i64>> = v
        .into_iter()
        .chain([*v.last().unwrap_or(&-1)].iter())
        .scan(Vec::new(), |s, &e| {
            match s.last() {
                None => { s.push(e); Some((false, Vec::new())) },
                Some(&p) if p == e - 1 => { s.push(e); Some((false, Vec::new()))},
                Some(&p) if p != e - 1 => {let o = s.clone(); *s = vec![e]; Some((true, o))},
                _ => None,
            }
         })
         .filter_map(|(n, v)| {
             match n {
                 true => Some(v.clone()),
                 false => None,
             }
         })
         .collect();

    t
}

The chain is used to get the last vector.

like image 37
iclac Avatar answered Oct 23 '22 14:10

iclac