
How to split string into chunks in Rust to insert spaces

I'm attempting to learn Rust, and a recent problem I've encountered is the following: given a String whose length is an exact multiple of n, I want to split the string into chunks of size n, insert a space between the chunks, and then collect the result back into a single string.

The issue I was running into, is that the chars() method returns the Chars struct, which for some reason doesn't implement the SliceConcatExt trait, so chunks() can't be called on it.

Furthermore, once I've successfully created a Chunks struct (by calling .bytes() instead) I'm unsure how to call a .join(' ') since the elements are now Chunks of byte slices...

There has to be an elegant way to do this I'm missing.

For example here is an input / output that illustrates the situation:

given: whatupmyname, 4
output: what upmy name

This is my poorly written attempt:

let n = 4;
let text = "whatupmyname".into_string();
text.chars()
    // compiler error on chunks() call
    .chunks(n)
    .collect::<Vec<String>>()
    .join(' ')

Thank you for any help!

Zeke asked Jul 14 '19 18:07




2 Answers

The problem here is that chars() and bytes() return iterators, not slices, and chunks() is a slice method. You could use as_bytes(), which gives you a &[u8]. However, you cannot get a &[char] directly from a &str: a &str stores only UTF-8 bytes, and each char has to be decoded by scanning to see how many bytes make it up. You'd have to do something like this:

text.chars()
    .collect::<Vec<char>>()
    .chunks(n)
    .map(|c| c.iter().collect::<String>())
    .collect::<Vec<String>>()
    .join(" ");
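
For reference, the snippet above can be wrapped into a function and tried on its own. This is only a sketch: the function name space_chunks_collected and its signature are made up for illustration, and n is assumed to be non-zero.

```rust
/// Splits `text` into groups of `n` chars and joins the groups with spaces.
/// Hypothetical helper name; not part of the standard library.
/// Allocates a Vec<char>, one String per chunk, and the joined result.
fn space_chunks_collected(text: &str, n: usize) -> String {
    text.chars()
        .collect::<Vec<char>>() // decode the UTF-8 bytes into chars
        .chunks(n) // slice method; panics if n == 0
        .map(|chunk| chunk.iter().collect::<String>())
        .collect::<Vec<String>>()
        .join(" ")
}
```

Calling space_chunks_collected("whatupmyname", 4) gives "what upmy name". Because the chunks are taken over chars rather than bytes, multi-byte characters are never split.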

However, I would NOT recommend this as it has to allocate a lot of temporary storage for Vecs and Strings along the way. Instead, you could do something like this, which only has to allocate to create the final String.

text.chars()
    .enumerate()
    .flat_map(|(i, c)| {
        if i != 0 && i % n == 0 {
            Some(' ')
        } else {
            None
        }
        .into_iter()
        .chain(std::iter::once(c))
    })
    .collect::<String>()

This stays lazy until the final collect: flat_map turns each character into a small iterator that yields either just the character, or a space followed by the character.
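
As a self-contained sketch of the same idea, the iterator chain can be put in a function. The name space_chunks is hypothetical, and n is assumed to be non-zero (i % n would panic otherwise). The separator is bound to a variable first, which keeps the if/else in expression position.

```rust
use std::iter;

/// Inserts a space before every n-th char (except the first) while
/// collecting directly into the final String.
/// Hypothetical helper name; assumes n > 0.
fn space_chunks(text: &str, n: usize) -> String {
    text.chars()
        .enumerate()
        .flat_map(|(i, c)| {
            // Some(' ') at each chunk boundary, None everywhere else.
            let sep = if i != 0 && i % n == 0 { Some(' ') } else { None };
            // Yields either [' ', c] or just [c].
            sep.into_iter().chain(iter::once(c))
        })
        .collect()
}
```

For the example in the question, space_chunks("whatupmyname", 4) returns "what upmy name". A trailing partial chunk is kept as-is, so space_chunks("abcde", 2) returns "ab cd e".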

JayDepp answered Sep 21 '22 14:09


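If the chunk size is fixed and every chunk boundary falls on a character boundary (always true for ASCII text), you can chunk the bytes directly and convert each chunk back to a &str: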

use std::str;

fn main() {
    let subs = "&#8204;&#8203;&#8204;&#8203;&#8204;&#8203;&#8203;&#8204;&#8203;&#8204;".as_bytes()
        .chunks(7)
        .map(str::from_utf8)
        .collect::<Result<Vec<&str>, _>>()
        .unwrap();
        
    println!("{:?}", subs);
}

// >> ["&#8204;", "&#8203;", "&#8204;", "&#8203;", "&#8204;", "&#8203;", "&#8203;", "&#8204;", "&#8203;", "&#8204;"]
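
To get from those byte chunks to the space-separated string the question asks for, the chunks can be joined once they have been validated. The following is a sketch only; the function name space_chunks_bytes is made up, n is assumed to be non-zero, and the conversion error is returned instead of unwrapped because a chunk boundary may land inside a multi-byte character.

```rust
use std::str;

/// Chunks the raw bytes of `text` into groups of `n` and joins them with spaces.
/// Hypothetical helper name; assumes n > 0 (chunks panics on 0).
/// Returns Err if a chunk boundary splits a multi-byte UTF-8 character.
fn space_chunks_bytes(text: &str, n: usize) -> Result<String, str::Utf8Error> {
    let parts = text
        .as_bytes()
        .chunks(n)
        .map(str::from_utf8) // each chunk must be valid UTF-8 on its own
        .collect::<Result<Vec<&str>, _>>()?;
    Ok(parts.join(" "))
}
```

With ASCII input such as "whatupmyname" and n = 4 this yields Ok("what upmy name"). With "héllo" and n = 2 it returns an Err, because the two bytes of é end up in different chunks.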
Esteban Borai answered Sep 20 '22 14:09