
Combine Regexp?

Tags:

regex

I collect user input for various conditions, such as:

  1. Starts with : /(^@)/
  2. Ends with : /(@$)/
  3. Contains : /@/
  4. Doesn't contain

To build a single regex when the user enters multiple conditions, I combine them with "|", so if conditions 1 and 2 are given, the result is /(^@)|(@$)/

This method works so far, but I can't work out what the regex for the 4th condition should be. And will combining regexes this way work?


Update: @ (the user input) won't be the same for any two conditions, and not all four conditions will always be present, though they can be. In the future I might need more conditions such as "is exactly" and "is exactly not", so I'm more curious to know whether this approach will scale.

There may also be issues with cleaning up the user input so that it is properly escaped for the regex, but I'm ignoring that for now.
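To make the escaping and scaling concerns concrete, here is a minimal Python sketch of how each condition might be turned into a regex fragment from escaped user input. The function name `condition_pattern` and the condition keys are hypothetical, chosen for illustration; the "is exactly" and "is exactly not" templates are one possible reading of those conditions, not something the question specifies.

```python
import re

def condition_pattern(kind: str, text: str) -> str:
    """Return a regex fragment for one condition (hypothetical helper)."""
    # re.escape neutralizes metacharacters such as "." or "*" in user input.
    lit = re.escape(text)
    templates = {
        "starts_with": f"^{lit}",
        "ends_with": f"{lit}$",
        "contains": lit,
        # Every position in the input must not be the start of the literal.
        "not_contains": f"^(?:(?!{lit}).)*$",
        "is_exactly": f"^{lit}$",
        # Succeeds at position 0 unless the whole input is the literal.
        "is_not_exactly": f"^(?!{lit}$)",
    }
    return templates[kind]
```

Adding a new condition then means adding one template. The sketch assumes single-line input: in Python, "." does not match a newline and "$" also matches just before a trailing newline.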

nexneo asked May 15 '09 17:05



1 Answer

Will the conditions be ORed or ANDed together?

Starts with: abc
Ends with: xyz
Contains: 123
Doesn't contain: 456

The OR version is fairly simple; as you said, it's mostly a matter of inserting pipes between individual conditions. The regex simply stops looking for a match as soon as one of the alternatives matches.

/^abc|xyz$|123|^(?:(?!456).)*$/ 

That fourth alternative may look bizarre, but that's how you express "doesn't contain" in a regex. By the way, the order of the alternatives doesn't affect whether the input matches; this is effectively the same regex:

/xyz$|^(?:(?!456).)*$|123|^abc/ 
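As a quick check of the OR version, here is a small Python sketch that runs both orderings against a few sample strings. The sample strings and the names `OR_A`, `OR_B` and `matches_any` are made up for illustration; the patterns are the two above, written as Python raw strings.

```python
import re

# The two orderings of the OR pattern, as Python raw strings.
OR_A = re.compile(r"^abc|xyz$|123|^(?:(?!456).)*$")
OR_B = re.compile(r"xyz$|^(?:(?!456).)*$|123|^abc")

def matches_any(text: str) -> bool:
    """True if at least one of the four conditions holds."""
    return OR_A.search(text) is not None

samples = ["abc456", "456xyz", "4561237", "no digits here", "x456y"]
or_results = {s: matches_any(s) for s in samples}
# Only "x456y" fails: it contains 456 and satisfies none of the other three.

# Both orderings give the same yes/no answer for every sample...
same_verdict = all(
    (OR_A.search(s) is None) == (OR_B.search(s) is None) for s in samples
)

# ...although the text actually matched can differ between orderings.
matched_a = OR_A.search("abcd").group()  # "abc"  (via ^abc)
matched_b = OR_B.search("abcd").group()  # "abcd" (via the "doesn't contain" branch)
```

Note that with OR, the "doesn't contain" branch accepts any input that lacks 456, so the combined pattern rejects very little.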

The AND version is more complicated. After each individual regex matches, the match position has to be reset to zero so the next regex has access to the whole input. That means all of the conditions have to be expressed as lookaheads (technically, one of them doesn't have to be a lookahead, but I think it expresses the intent more clearly this way). A final .*$ consummates the match.

/^(?=^abc)(?=.*xyz$)(?=.*123)(?=^(?:(?!456).)*$).*$/ 
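The AND pattern can be exercised the same way. The Python sketch below uses made-up sample strings and hypothetical names (`AND_ALL`, `PARTS`, `matches_all`, `matches_all_separately`). The second function is a different technique, not the single-regex approach described above: it runs each condition as its own regex and combines the results with `all()`, which may be easier to maintain if one combined pattern is not a hard requirement.

```python
import re

# The AND pattern: each condition is a lookahead evaluated from position 0,
# and the trailing .*$ consumes the input once all of them pass.
AND_ALL = re.compile(r"^(?=^abc)(?=.*xyz$)(?=.*123)(?=^(?:(?!456).)*$).*$")

def matches_all(text: str) -> bool:
    """True only if all four conditions hold (single-line input assumed)."""
    return AND_ALL.search(text) is not None

# Alternative technique: one small regex per condition, combined in code.
PARTS = [r"^abc", r"xyz$", r"123", r"^(?:(?!456).)*$"]

def matches_all_separately(text: str) -> bool:
    return all(re.search(p, text) is not None for p in PARTS)

samples = ["abc123xyz", "abc-123-xyz", "abc123456xyz", "abc123", "123xyz", "abcxyz"]
and_results = {s: matches_all(s) for s in samples}
# Only the first two samples satisfy every condition.
```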

And then there's the possibility of combined AND and OR conditions--that's where the real fun starts. :D
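One way mixed conditions might be assembled, sketched in Python under stated assumptions: each condition is written as a fragment that is valid at position 0, an AND group is a run of lookaheads, and an OR is an alternation of such groups. The helpers `all_of` and `any_of` and the example rule are hypothetical, not part of the answer.

```python
import re

def all_of(*fragments: str) -> str:
    """AND: wrap every fragment in a lookahead so each sees the whole input."""
    return "".join(f"(?={f})" for f in fragments)

def any_of(*groups: str) -> str:
    """OR: alternation inside a non-capturing group."""
    return "(?:" + "|".join(groups) + ")"

# Fragments are written relative to the start of the input.
STARTS_ABC = r"abc"
ENDS_XYZ = r".*xyz$"
HAS_123 = r".*123"
NO_456 = r"(?:(?!456).)*$"

# (starts with abc AND ends with xyz) OR (contains 123 AND doesn't contain 456)
MIXED = re.compile(
    "^" + any_of(all_of(STARTS_ABC, ENDS_XYZ), all_of(HAS_123, NO_456)) + ".*$"
)

samples = ["abc--xyz", "abc456xyz", "x123y", "x123456", "abc"]
mixed_results = {s: MIXED.search(s) is not None for s in samples}
```

Because `any_of` returns an ordinary fragment, it can itself be passed to `all_of`, so groups can be nested to any depth.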

Alan Moore answered Sep 20 '22 17:09