
Regular Expressions: Match up to a word or a maximum number of words

Tags: python, regex

I want to look for a phrase, match up to a few words following it, but stop early if I find another specific phrase.

For example, I want to match up to three words following "going to the", but stop matching if I encounter "to try". So "going to the luna park" should yield "luna park"; "going to the capital city of Peru" should yield "capital city of"; and "going to the moon to try some cheesecake" should yield "moon".

Can it be done with a single, simple regular expression (preferably in Python)? I've tried all the combinations I could think of, but failed miserably :).

asked Oct 22 '22 14:10 by r0u1i


1 Answer

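This pattern matches one to three words ({1,3}) following "going to the". The negative lookahead (?!to try) is checked after each word and its trailing whitespace. When "to try" comes next, the engine backtracks and gives the whitespace back, so no further word can start and the match stops there: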

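import re

# Read the sample sentences, one per line, from a file named "input".
with open("input") as infile:
    for line in infile:
        # Capture up to three words after the phrase, stopping before "to try".
        m = re.match(r"going to the ((?:\w+\s*(?!to try)){1,3})", line)
        if m:
            print(m.group(1).rstrip())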

Output

luna park
capital city of
moon
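
One edge case may be worth checking: because the lookahead runs after each word rather than before it, a "to try" that comes directly after "going to the" is not rejected by the pattern above. The sketch below is a possible variant, not part of the original answer. It moves the lookahead in front of each word and uses re.search so the phrase can appear anywhere in the line. The names ORIGINAL, VARIANT and words_after are hypothetical, and the \b after "try" is an assumption that "to trying" should not stop the match.

```python
import re

# Pattern from the answer above: lookahead is checked AFTER each word.
ORIGINAL = re.compile(r"going to the ((?:\w+\s*(?!to try)){1,3})")

# Hypothetical variant: lookahead is checked BEFORE each word, so a word
# is only consumed if "to try" does not start at that position.
VARIANT = re.compile(r"going to the ((?:(?!to try\b)\w+\s*){1,3})")


def words_after(text, pattern=VARIANT):
    """Return up to three words after 'going to the', or None if no match."""
    m = pattern.search(text)
    return m.group(1).rstrip() if m else None


print(words_after("going to the luna park"))                    # luna park
print(words_after("going to the capital city of Peru"))         # capital city of
print(words_after("going to the moon to try some cheesecake"))  # moon

# The two patterns differ when "to try" follows the phrase immediately.
print(words_after("going to the to try it", ORIGINAL))  # to try it
print(words_after("going to the to try it", VARIANT))   # None
```

Both patterns give the same results on the three sentences from the question, so which one fits depends on whether an immediate "to try" should count as a match.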

answered Oct 24 '22 05:10 by perreal