
Is it safe to read regular expressions from a file?

Assuming a Perl script that allows users to specify several text filter expressions in a config file, is there a safe way to let them enter regular expressions as well, without the possibility of unintended side effects or code execution? Without actually parsing the regexes and checking them for problematic constructs, that is. There won't be any substitution, only matching.

As an aside, is there a way to test if the specified regex is valid before actually using it? I'd like to issue warnings if something like /foo (bar/ was entered.

Thanks, Z.


EDIT:
Thanks for the very interesting answers. I've since found out that, in a pattern interpolated at run time, the following dangerous constructs are only evaluated if the use re 'eval' pragma is in effect:
(?{code})
(??{code})
${code}
@{code}

The default is no re 'eval'; so unless I'm missing something, it should be safe to read regular expressions from a file, with the only check being the eval/catch posted by Axeman. At least I haven't been able to hide anything evil in them in my tests.
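For reference, a minimal sketch of what such an eval-based validity check might look like. The helper name compile_user_regex is hypothetical and this is not Axeman's exact code; it only assumes that qr// dies on a malformed pattern and that no re 'eval' stays in effect.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a pattern string read from a config file.
# Returns a compiled regex, or undef (after a warning) if the pattern
# is malformed or contains a run-time code block.
sub compile_user_regex {
    my ($pattern) = @_;
    my $re = eval {
        no re 'eval';    # already the default; stated explicitly for clarity
        qr/$pattern/;    # dies on syntax errors and on (?{...}) / (??{...})
    };
    unless (defined $re) {
        warn "Ignoring invalid regex '$pattern': $@";
        return;
    }
    return $re;
}

my $good = compile_user_regex('foo (bar)?');   # compiles
my $bad  = compile_user_regex('foo (bar');     # warns, $bad is undef

print "matched\n" if 'foo bar' =~ $good;
```

The compiled object returned by qr// can be stored and reused for matching, so each pattern from the file only needs to be checked once at load time.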

Thanks again. Z.

Zilk asked Oct 28 '08 03:10



1 Answer

Depending on what you're matching against, and the version of Perl you're running, there might be some regexes that act as an effective denial of service attack by using excessive lookaheads, lookbehinds, and other assertions.

You're best off allowing only a small, well-known subset of regex patterns and expanding it cautiously as you and your users learn how to use the system, in the same way that many blog commenting systems allow only a small subset of HTML tags.
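As a rough illustration of the subset idea, here is one possible allow-list check. The function name is_allowed_pattern, the particular subset, and the length cap are all assumptions made for this sketch; a real system would choose its own subset.

```perl
use strict;
use warnings;

# Hypothetical allow-list: accept only patterns built from word characters,
# spaces, '.', simple character classes, alternation, optional ^ and $
# anchors, and a single ?, * or + after an atom. Groups, lookarounds,
# backreferences and code blocks are all rejected.
sub is_allowed_pattern {
    my ($pattern) = @_;
    return 0 if length($pattern) > 100;    # arbitrary cap (assumption)

    # One atom: a word character or space, a dot, or a simple class
    # such as [a-z] or [^abc].
    my $atom = qr/ [\w ] | \. | \[ \^? [\w \-]+ \] /x;

    return $pattern =~ /\A \^? (?: $atom [?*+]? | \| )* \$? \z/x ? 1 : 0;
}

for my $p ('foo.*bar', '^error|warn$', '(a+)+b', '(?{ 1 })') {
    printf "%-14s %s\n", $p, is_allowed_pattern($p) ? 'allowed' : 'rejected';
}
```

A pattern that passes this check can still be invalid (for example a reversed range inside a character class), so it would still need to be compiled inside an eval before use.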

Eventually Parse::RecDescent might become useful, if you need to do complex analysis of regexes.

Sam Kington answered Oct 09 '22 19:10