
Regex: I want this AND that AND that... in any order

Tags: c#, regex

I'm not even sure if this is possible or not, but here's what I'd like.

String: "NS306 FEBRUARY 20078/9/201013B1-9-1Low31 AUGUST 19870" 

I have a text box where I type in the search parameters, and they are space delimited. Because of this, I want to return a match if string1 is in the string and then string2 is in the string, OR string2 is in the string and then string1 is in the string. I don't care what order the strings are in, but they ALL (there will sometimes be more than 2) have to be in the string.

So for instance, in the provided string I would want:

"FEB Low" 

or

"Low FEB" 

...to return as a match.

I'm REALLY new to regex; I've only read some tutorials on here, and that was a while ago, and I need to get this done today. Monday I start a new project which is much more important, and I can't be distracted by this issue. Is there any way to do this with regular expressions, or do I have to iterate through each part of the search filter and permute the order? Any and all help is extremely appreciated. Thanks.

UPDATE: The reason I don't want to iterate through a loop, and am looking for the best-performing option, is that unfortunately the dataTable I'm using calls this function on every key press, and I don't want it to bog down.

UPDATE: Thank you everyone for your help, it was much appreciated.

CODE UPDATE:

Ultimately, this is what I went with.

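    // Assumption: nvc is a NameValueCollection, so the indexer returns null for a missing key.
    string sSearch = nvc["sSearch"];
    if (!string.IsNullOrEmpty(sSearch))
    {
        // "FEB Low" becomes "^(?=.*FEB)(?=.*Low).*$": one lookahead per search term.
        // Note: the terms are not escaped, so regex metacharacters typed by the user stay active.
        sSearch = sSearch.Replace(" ", ")(?=.*");
        Regex r = new Regex("^(?=.*" + sSearch + ").*$", RegexOptions.IgnoreCase);
        _AdminList = _AdminList.Where<IPB>(
            delegate(IPB ipb)
            {
                // Concatenated all elements of IPB into a string
                bool returnValue = r.IsMatch(strTest); // strTest is the concatenated string
                return returnValue;
            }).ToList<IPB>();
    }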

The IPB class has X number of elements, and in no one table throughout the site I'm working on are the columns in the same order. Therefore, I needed an any-order search and I didn't want to have to write a lot of code to do it. There were other good ideas in here, but I know my boss really likes Regex (preaches them), and therefore I thought it'd be best if I went with that for now. If for whatever reason the site's performance slips (intranet site), then I'll try another way. Thanks everyone.

asked Aug 20 '10 17:08 by XstreamINsanity




2 Answers

You can use (?=…) positive lookahead; it asserts that a given pattern can be matched. You'd anchor at the beginning of the string, and one by one, in any order, look for a match of each of your patterns.

It'll look something like this:

^(?=.*one)(?=.*two)(?=.*three).*$ 

This will match a string that contains "one", "two", "three", in any order (as seen on rubular.com).

Depending on the context, you may want to anchor on \A and \Z, and use single-line mode so the dot matches everything.
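As a rough C# sketch of the idea above (written as top-level statements, so it assumes C# 9 or later): `BuildAllTermsRegex` is a hypothetical helper name, not something from the original answer. It emits one lookahead per space-delimited term, escapes each term so user input is treated literally, anchors with `\A`, and uses single-line mode. The trailing `.*$` is left out because `IsMatch` only needs the lookaheads to succeed.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: builds \A(?=.*term1)(?=.*term2)... from a space-delimited query.
Regex BuildAllTermsRegex(string query)
{
    var lookaheads = query
        .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
        // Regex.Escape keeps characters such as '.' or '(' in user input literal.
        .Select(term => "(?=.*" + Regex.Escape(term) + ")");

    // \A anchors at the very start; Singleline lets '.' match newlines too.
    return new Regex(@"\A" + string.Concat(lookaheads),
                     RegexOptions.IgnoreCase | RegexOptions.Singleline);
}

string haystack = "NS306 FEBRUARY 20078/9/201013B1-9-1Low31 AUGUST 19870";

Console.WriteLine(BuildAllTermsRegex("FEB Low").IsMatch(haystack));   // True
Console.WriteLine(BuildAllTermsRegex("Low FEB").IsMatch(haystack));   // True
Console.WriteLine(BuildAllTermsRegex("FEB March").IsMatch(haystack)); // False
```

An empty query produces a pattern with no lookaheads, which matches every row; that is probably the behaviour you want for an empty filter box.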

This is not the most efficient solution to the problem. A better solution would be to parse out the words in your input and put them into an efficient set representation, etc.
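One possible reading of the set-based suggestion, sketched in C# (top-level statements, C# 9 or later). `ToWordSet` and `ContainsAllWords` are hypothetical names, and the sample row is an assumed one whose fields are already separated by spaces. Note the trade-off: a word set matches whole words only, so a prefix such as "FEB" no longer matches "FEBRUARY".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: splits text on spaces into a case-insensitive word set.
HashSet<string> ToWordSet(string text)
{
    return new HashSet<string>(
        text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries),
        StringComparer.OrdinalIgnoreCase);
}

// Hypothetical helper: true when every query word is one of the row's words.
bool ContainsAllWords(HashSet<string> words, string query)
{
    return ToWordSet(query).All(word => words.Contains(word));
}

// Assumed sample row; build the set once per row, then reuse it for every query.
HashSet<string> rowWords = ToWordSet("NS306 FEBRUARY 2007 Low 31 AUGUST 1987");

Console.WriteLine(ContainsAllWords(rowWords, "low february")); // True
Console.WriteLine(ContainsAllWords(rowWords, "FEB Low"));      // False: "FEB" is not a whole word
```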

Related questions

  • How does the regular expression (?<=#)[^#]+(?=#) work?

More practical example: password validation

Let's say that we want our password to:

  • Contain between 8 and 15 characters
  • Must contain an uppercase letter
  • Must contain a lowercase letter
  • Must contain a digit
  • Must contain one of special symbols

Then we can write a regex like this:

^(?=.{8,15}$)(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9])(?=.*[!@#$%^&*]).*$
 \__________/\_________/\_________/\_________/\______________/
    length      upper      lower      digit        symbol
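A quick way to try the password pattern in C# (top-level statements, C# 9 or later). The variable name `passwordRule` and the sample passwords are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern from above, in a verbatim string so no backslash escaping is needed.
var passwordRule = new Regex(
    @"^(?=.{8,15}$)(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9])(?=.*[!@#$%^&*]).*$");

Console.WriteLine(passwordRule.IsMatch("Passw0rd!"));  // True
Console.WriteLine(passwordRule.IsMatch("password1!")); // False: no uppercase letter
Console.WriteLine(passwordRule.IsMatch("Pw0!"));       // False: shorter than 8 characters
```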
answered Sep 29 '22 06:09 by polygenelubricants


Why not just do a simple check for the text since order doesn't matter?

    string test = "NS306 FEBRUARY 20078/9/201013B1-9-1Low31 AUGUST 19870";
    test = test.ToUpper();
    bool match = ((test.IndexOf("FEB") >= 0) && (test.IndexOf("LOW") >= 0));

Do you need it to use regex?
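The check above hard-codes two terms. A minimal sketch of how it might generalize to any number of space-delimited terms (top-level statements, C# 9 or later); `ContainsAllTerms` is a hypothetical name, and `StringComparison.OrdinalIgnoreCase` stands in for the `ToUpper()` call so the row does not have to be copied.

```csharp
using System;
using System.Linq;

// Hypothetical helper: true when every space-delimited term occurs somewhere in text.
bool ContainsAllTerms(string text, string query)
{
    return query
        .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
        // Case-insensitive substring search, no regex involved.
        .All(term => text.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0);
}

string row = "NS306 FEBRUARY 20078/9/201013B1-9-1Low31 AUGUST 19870";

Console.WriteLine(ContainsAllTerms(row, "FEB Low"));        // True
Console.WriteLine(ContainsAllTerms(row, "Low FEB AUGUST")); // True
Console.WriteLine(ContainsAllTerms(row, "FEB March"));      // False
```

There is still a loop over the terms, but it runs once per search term rather than once per permutation of terms, so the work per row stays small.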

answered Sep 29 '22 07:09 by Kelsey