Match two words in arbitrary order using regex

Tags:

python

regex

I have spent some time learning regular expressions, but I still don't understand how the following trick manages to match two words in either order.

import re
reobj = re.compile(r'^(?=.*?(John))(?=.*?(Peter)).*$', re.MULTILINE)

string = '''
John and Peter
Peter and John
James and Peter and John
'''
# Each line that matches yields one tuple with the two captured groups
print(reobj.findall(string))

Result:

[('John', 'Peter'), ('John', 'Peter'), ('John', 'Peter')]

( https://www.regex101.com/r/qW4rF4/1)

I know the (?=.* ) part is called a positive lookahead, but how does it work in this situation?

Any explanation?

asked Apr 06 '15 10:04

Aaron

1 Answer

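The regex does not really match the two words in an arbitrary order. The actual matching (the consuming of characters) is done by the final .*, which consumes everything up to the end of the line. A positive lookahead only makes an assertion about what follows the current position. You have two lookaheads, and they are independent of each other: each one asserts the presence of one word. Because a lookahead does not advance the position, both are checked from the same place, the start of the line (because of ^). So your regex works like this: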

1) (?=.*?(John)) : the line must contain John. This is just an assertion; it does not consume anything.

2) (?=.*?(Peter)) : the line must contain Peter. This is just an assertion; it does not consume anything.

3) .* : consumes the whole line, but only if both assertions have passed.
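
The three steps above can be traced on a single line. This is only a small sketch, not part of the original answer; the names line, m and full are made up for the illustration. It shows that a lookahead on its own produces an empty match even though the group inside it still captures, and that the trailing .* is what actually consumes the line.

```python
import re

line = "Peter and John"

# A lookahead on its own succeeds without consuming any characters.
m = re.match(r'(?=.*?(John))', line)
print(m.span())    # (0, 0): the overall match is empty
print(m.group(1))  # John: the group inside the lookahead still captures

# With both lookaheads plus the trailing .* the whole line is consumed.
full = re.match(r'^(?=.*?(John))(?=.*?(Peter)).*$', line)
print(full.group(0))  # Peter and John
print(full.groups())  # ('John', 'Peter')
```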

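So you see, the order does not matter here; what is important is that both assertions pass. The capture groups inside the lookaheads still record John and Peter, which is why findall returns ('John', 'Peter') for every matching line, whichever word comes first.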
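
To see that both assertions have to pass, here is a small sketch, assuming some extra test lines that are not in the question ("John alone" and "Peter alone"); the names pattern, text and matched_lines are also made up here. Lines containing only one of the two words are skipped, while lines containing both are matched in either order.

```python
import re

pattern = re.compile(r'^(?=.*?(John))(?=.*?(Peter)).*$', re.MULTILINE)

# Hypothetical test input: two lines contain only one of the words.
text = """John and Peter
Peter and John
John alone
Peter alone
James and Peter and John"""

# group(0) is the text consumed by .*, i.e. the whole matching line.
matched_lines = [m.group(0) for m in pattern.finditer(text)]
print(matched_lines)
# ['John and Peter', 'Peter and John', 'James and Peter and John']
```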

answered Oct 23 '22 18:10

vks
