 

Running a number of consecutive replacements on the same string

Tags:

rust

I found this example for substring replacement:

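fn main() {
    let string = "orange";
    // `replace` returns a newly allocated String; `string` itself is unchanged.
    let new_string = string.replace("or", "str");
    assert_eq!(new_string, "strange");
}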

If I want to run a number of consecutive replacements on the same string, for sanitization purposes, how can I do that without allocating a new variable for each replacement?

If you were to write idiomatic Rust, how would you write multiple chained substring replacements?

mkaito asked Dec 15 '14 03:12



1 Answer

I would not use regex or .replace().replace().replace() or .maybe_replace().maybe_replace().maybe_replace() for this. They all have big flaws.

  • Regex is probably the most reasonable of the three, but regexes are a terrible idea if you can avoid them at all. If your patterns come from user input, you have to escape them, which is a security nightmare.
  • .replace().replace().replace() is terrible because every call scans the whole string again and allocates a new String, so N replacements cost N passes and N allocations.
  • .maybe_replace().maybe_replace().maybe_replace() is only very slightly better, because it improves efficiency only when a pattern doesn't match. It doesn't avoid the repeated allocations if the patterns all match, and in that case it is actually worse because it searches the string twice per pattern.
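To make the cost of the two chained approaches concrete, here is a minimal sketch using only the standard library. `maybe_replace` is not a standard library method; the helper below is a hypothetical version built on `Cow`, assumed purely for illustration, and the function names are made up for this example.

```rust
use std::borrow::Cow;

/// Chained `replace`: every call scans the whole input and allocates a
/// fresh `String`, whether or not the pattern occurs.
fn chained_replace(input: &str) -> String {
    input
        .replace("quick", "slow")
        .replace("brown", "grey")
        .replace("fox", "sloth")
}

/// Hypothetical `maybe_replace`: allocate only when the pattern is present.
/// On a match the input is searched twice: `contains` scans it, then
/// `replace` scans it again.
fn maybe_replace<'a>(input: Cow<'a, str>, from: &str, to: &str) -> Cow<'a, str> {
    if input.contains(from) {
        Cow::Owned(input.replace(from, to))
    } else {
        input
    }
}

/// The same three replacements, borrowing the input until something matches.
fn chained_maybe_replace(input: &str) -> Cow<'_, str> {
    let s = Cow::Borrowed(input);
    let s = maybe_replace(s, "quick", "slow");
    let s = maybe_replace(s, "brown", "grey");
    maybe_replace(s, "fox", "sloth")
}
```

With no match, `chained_maybe_replace` hands back the borrowed input untouched; with three matches it allocates three times, just like `chained_replace`, and searches six times instead of three.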

There's a much better solution: use the aho-corasick crate. There's even an example in the readme:

use aho_corasick::AhoCorasick;

let patterns = &["fox", "brown", "quick"];
let haystack = "The quick brown fox.";
let replace_with = &["sloth", "grey", "slow"];

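// Note: in aho-corasick 1.x, `AhoCorasick::new` returns a Result, so append `.unwrap()` or handle the error.
let ac = AhoCorasick::new(patterns);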
let result = ac.replace_all(haystack, replace_with);
assert_eq!(result, "The slow grey sloth.");
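A single pass is not only faster than chaining, it can also give a different result, because replaced text is never scanned again. The sketch below shows that with the standard library alone. It is a naive stand-in, not the Aho-Corasick automaton and not the crate's implementation: `replace_all_single_pass` is a hypothetical name, and its rule (at each position, the first listed pattern that matches wins) is an assumption made for the example; the real crate's match semantics are configurable and may differ.

```rust
/// Replace every occurrence of `patterns[i]` with `replacements[i]` in one
/// left-to-right pass. At each position the first pattern in the list that
/// matches wins, and replaced text is never rescanned.
///
/// This is a naive scan that tries every pattern at every position.
fn replace_all_single_pass(haystack: &str, patterns: &[&str], replacements: &[&str]) -> String {
    assert_eq!(patterns.len(), replacements.len());
    let mut out = String::with_capacity(haystack.len());
    let mut rest = haystack;
    'scan: while let Some(ch) = rest.chars().next() {
        for (pat, rep) in patterns.iter().zip(replacements) {
            // Skip empty patterns so the scan always makes progress.
            if !pat.is_empty() && rest.starts_with(*pat) {
                out.push_str(rep);
                rest = &rest[pat.len()..];
                continue 'scan;
            }
        }
        // No pattern starts here: copy one character and move on.
        out.push(ch);
        rest = &rest[ch.len_utf8()..];
    }
    out
}
```

For the patterns `["a", "b"]` with replacements `["b", "c"]`, this function turns "ab" into "bc", whereas `"ab".replace("a", "b").replace("b", "c")` produces "cc" because the second call also rewrites the output of the first.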

Regarding "for sanitization purposes": I should also say that blacklisting "bad" strings is completely the wrong way to do sanitisation.
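The answer does not say what to do instead. One commonly suggested alternative is an allowlist: keep only input that is known to be acceptable rather than deleting what is known to be bad. A minimal sketch, assuming a hypothetical `keep_allowed` helper and an arbitrary allowed set chosen only for illustration:

```rust
/// Keep only characters from an explicit allowlist and drop everything else.
/// The allowed set here (ASCII letters, digits, '-', '_') is an arbitrary
/// example, not a recommendation for any particular context.
fn keep_allowed(input: &str) -> String {
    input
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || *c == '-' || *c == '_')
        .collect()
}
```

For example, `keep_allowed("user_name-42<script>")` yields "user_name-42script". Whether to drop disallowed characters, reject the whole input, or escape it for the output context depends on where the string ends up.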

Timmmm answered Oct 06 '22 04:10
