
How do I create a Regex from a user-provided string which contains regex metacharacters?

Tags: regex, rust

I need to create a regular expression using the regex crate which includes a string passed as a command line argument to the program. The command line argument can contain $ and {}.

If I hard-code the string as r"...", it works fine, but if I use the command-line argument as format!(r#"{}"#, arg_str), I get the following error (assuming arg_str = ${replace}):

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Syntax(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
regex parse error:
    ${replace}
      ^
error: decimal literal empty
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
)', libcore/result.rs:945:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Simplified code example to demonstrate this issue:

extern crate regex;
use regex::Regex;

fn main() {
    let args: Vec<_> = std::env::args().collect();
    let ref arg_str = args[1];

    let re = Regex::new(format!(r#"{}"#, arg_str).as_str()).unwrap();
    println!("{:?}", re);
}

If this is run with a simple argument like replace, there is no error, but if I pass it something like ${replace}, I get the error mentioned above.

schaazzz asked May 26 '17 07:05


1 Answer

The regex crate has a function escape which does what you need.

From the documentation:

Function regex::escape

pub fn escape(text: &str) -> String

Escapes all regular expression meta characters in text.
The string returned may be safely used as a literal in a regular expression.

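The parse error occurs because $ and {} are regex metacharacters: the { is read as the start of a counted repetition such as {3}, which expects a number. Passing your arg_str through regex::escape before handing it to Regex::new should fix your problem.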

belst answered Oct 20 '22 06:10