
Match two words in arbitrary order using regex

Tags:

python

regex

I have spent some time learning regular expressions, but I still don't understand how the following trick manages to match two words in either order.

import re

# Two lookaheads anchored at the start of each line, then .* to consume the line
reobj = re.compile(r'^(?=.*?(John))(?=.*?(Peter)).*$', re.MULTILINE)

string = '''
John and Peter
Peter and John
James and Peter and John
'''
print(reobj.findall(string))

result

[('John', 'Peter'), ('John', 'Peter'), ('John', 'Peter')]

( https://www.regex101.com/r/qW4rF4/1)

I know the (?=.* ) part is called Positive Lookahead, but how does it work in this situation?

Any explanation?

Aaron asked Apr 06 '15 10:04



1 Answer

The regex does not really match the words in any order at all. The consuming here is done by .*, which matches anything that comes its way; the capture groups sit inside the lookaheads. A positive lookahead only makes an assertion. You have two lookaheads, and they are independent of each other: each one asserts that one word is present. So your regex works like this:

1) (?=.*?(John)) === The string should have a John. Just an assertion. Does not consume anything.

2) (?=.*?(Peter)) === The string should have a Peter. Just an assertion. Does not consume anything.

3) .* === Consume anything if the assertions have passed.

So you see, the order does not matter here. What is important is that both assertions pass.
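To see this for yourself, here is a small sketch using the pattern from the question. The sample text and the variable names (pattern, text, matches, zero_width) are made up for illustration. It prints what .* consumed, what the groups inside the lookaheads captured, and where in the text those captures were found.

```python
import re

# Same pattern as in the question.
pattern = re.compile(r'^(?=.*?(John))(?=.*?(Peter)).*$', re.MULTILINE)

# Hypothetical sample text: the middle line has no "John".
text = "Peter and John\nJames and Peter\nJohn and Peter"

matches = []
for m in pattern.finditer(text):
    # group(0) is what .* consumed; groups 1 and 2 were captured
    # inside the lookaheads, wherever the words happen to sit.
    matches.append((m.group(0), m.groups(), m.span(1), m.span(2)))
    print(repr(m.group(0)), m.groups(), m.span(1), m.span(2))

# A lookahead on its own succeeds without consuming anything,
# so the match it produces is empty.
zero_width = re.match(r'(?=.*?John)', "Peter and John")
print(zero_width.span())  # (0, 0)
```

The middle line is skipped because the John assertion fails there. On the first line, group 1 ("John") is found after group 2 ("Peter"), even though its lookahead comes first in the pattern. That is possible because both lookaheads start scanning from the same position, the start of the line.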

vks answered Oct 23 '22 18:10