I want to write a function as follows:
What is a clean solution that will achieve this? My current solution is:
let mut swapped = input_str[0..i].to_string();
swapped.push(input_str.char_at(i));
swapped.push(input_str.char_at(i - 1));
swapped.push_str(&query[i..input_str.len()]);
But that only works for ASCII strings. I can think of other solutions as converting to a vector in UTF-32, swapping there and converting back to a string, but it seems like a lot of extra work.
Since String objects are immutable, going to a char[] via toCharArray , swapping the characters, then making a new String from char[] via the String(char[]) constructor would work.
As we know that Object of String in Java are immutable (i.e. we cannot perform any changes once its created). To do modifications on string stored in a String object, we copy it to a character array, StringBuffer, etc and do modifications on the copy object.
For swapping two strings from one location to another location, we use strcpy() function. An array of characters (or) collection of characters is called a string.
Here's a pretty solution:
use std::str::CharRange;
fn swap_chars_at(input_str: &str, i: usize) -> String {
// Pre-allocate a string of the correct size
let mut swapped = String::with_capacity(input_str.len());
// Pluck the previous character
let CharRange { ch: prev_ch, next: prev } = input_str.char_range_at_reverse(i);
// Pluck the current character
let CharRange { ch, next } = input_str.char_range_at(i);
// Put them back
swapped.push_str(&input_str[..prev]);
swapped.push(ch);
swapped.push(prev_ch);
swapped.push_str(&input_str[next..]);
// Done!
swapped
}
#[test]
fn smoke_test() {
let s = swap_chars_at("lyra", 2);
assert_eq!(s, "lrya");
}
#[test]
fn unicode() {
// 'ç' takes up 2 bytes in UTF-8
let s = swap_chars_at("ça va?", 2);
assert_eq!(s, "aç va?");
}
From the documentation:
fn char_range_at(&self, start: usize) -> CharRange
fn char_range_at_reverse(&self, start: usize) -> CharRange
Together, these two methods let us peek backwards and forwards in the string—which is exactly what we want.
But wait, there's more! DK pointed out a corner case with the above code. If the input contains any combining characters, they may become separated from the characters they combine with.
Now, this question is about Rust, not Unicode, so I won't go into the details of how exactly that works. All you need to know for now is that Rust provides this method:
fn grapheme_indices(&self, is_extended: bool) -> GraphemeIndices
With a healthy application of .find()
and .rev()
, we arrive at this (hopefully) correct solution:
#![allow(unstable)] // `GraphemeIndices` is unstable
fn swap_graphemes_at(input_str: &str, i: usize) -> String {
// Pre-allocate a string of the correct size
let mut swapped = String::with_capacity(input_str.len());
// Find the grapheme at index i
let (_, gr) = input_str.grapheme_indices(true)
.find(|&(index, _)| index == i)
.expect("index does not point to a valid grapheme");
// Find the grapheme just before it
let (prev, prev_gr) = input_str.grapheme_indices(true).rev()
.find(|&(index, _)| index < i)
.expect("no graphemes to swap with");
// Put it all back together
swapped.push_str(&input_str[..prev]);
swapped.push_str(gr);
swapped.push_str(prev_gr);
swapped.push_str(&input_str[i+gr.len()..]);
// Done!
swapped
}
#[test]
fn combining() {
// Ensure that "c\u{327}" is treated as a single unit
let s = swap_graphemes_at("c\u{327}a va?", 3);
assert_eq!(s, "ac\u{327} va?");
}
Admittedly it's a bit convoluted. First it iterates through the input, plucking out the grapheme cluster at i
. Then it iterates backward (.rev()
) through the input, picking the rightmost cluster with index < i
(i.e. the previous cluster). Finally it goes and puts everything back together.
If you're being really pedantic, there are still more special cases to deal with. For example, if the string contains Windows newlines ("\r\n"
), then we probably don't want to swap them around. And in Greek, the letter sigma (σ) is written differently when it's at the end of a word (ς), so a better algorithm should translate between them as necessary. And don't forget those bidirectional control characters...
But for the sake of our sanity, we'll stop here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With