
Python Conditional Regular Expression

Tags: python, regex

This is a question involving a conditional regular expression in Python:

I'd like to match the string "abc" with

match(1)="a"
match(2)="b"
match(3)="c"

but also match the string "  a" (two spaces, then "a") with

match(1)="a"
match(2)=""
match(3)=""

The following code ALMOST does this. The problem is that in the first case match(1)="a", but in the second case match(4)="a" (not match(1) as desired).

In fact, if you iterate through all the groups with for g in re.search(myre,teststring2).groups():, you get 6 groups (not 3 as was expected).

import re

teststring1 = "abc"
teststring2 = "  a"

# Two alternatives, each with three capturing groups (six in total).
myre = r'^(?=(\w)(\w)(\w))|(?=\s{2}(\w)()())'

m1 = re.search(myre, teststring1)
if m1:
    print(m1.group(1))  # "a"

m2 = re.search(myre, teststring2)
if m2:
    print(m2.group(1))  # None; the "a" ends up in group(4)

Any thoughts? (note this is for Python 2.5)
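The six groups can be reproduced with a small sketch. Python's re module numbers capturing groups by their opening parenthesis across the whole pattern, so the second alternative owns groups 4 to 6, and groups in the alternative that did not match come back as None. The helper first_three is hypothetical, added only to show one way of picking whichever branch matched. The sketch is written to run on current Python as well as 2.x.

```python
import re

# Same pattern as in the question, written as a raw string.
pattern = re.compile(r'^(?=(\w)(\w)(\w))|(?=\s{2}(\w)()())')

def first_three(m):
    # Hypothetical helper: return the three groups of the branch that matched.
    g = m.groups()
    if g[0] is not None:
        return g[:3]
    return g[3:]

for text in ("abc", "  a"):
    m = pattern.search(text)
    # groups() has one entry per capturing group in the pattern, matched or not.
    print("%r -> all groups: %r, chosen: %r" % (text, m.groups(), first_three(m)))
```

This keeps the original pattern unchanged and only post-processes the match, so it does not give a single group(1) that works for both inputs.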

Mike asked Jun 28 '10 05:06


1 Answer

Maybe...:

import re

teststring1 = "abc"
teststring2 = "  a"

# Up to two leading whitespace characters, one required word character,
# then two optional ones; all three groups always participate.
myre = r'^\s{0,2}(\w)(\w?)(\w?)$'

m1 = re.search(myre, teststring1)
if m1:
    print(m1.group(1))  # "a"

m2 = re.search(myre, teststring2)
if m2:
    print(m2.group(1))  # "a"

This does give "a" in both cases as you wish, but it may not behave the way you want in other cases you're not showing: for example, with no spaces in front, or with spaces followed by more than one letter, so that the total length of the matched string is != 3. I'm just guessing that you don't want matches in such cases...?
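To make those extra cases concrete, here is a rough sketch that runs the pattern above on a few inputs the question does not mention. The names loose, groups_or_none and strict_groups are made up for this illustration. strict_groups is only one possible tightening, and it assumes that exactly the two shapes from the question (three word characters, or two whitespace characters plus one word character) should match, which the question does not actually confirm.

```python
import re

# Pattern from the answer, as a raw string.
loose = re.compile(r'^\s{0,2}(\w)(\w?)(\w?)$')

def groups_or_none(text):
    # Return the three groups, or None when the loose pattern does not match.
    m = loose.search(text)
    return m.groups() if m else None

def strict_groups(text):
    # Hypothetical stricter variant: try the two shapes one after the other.
    # Both patterns have exactly three groups, so group(1) is always the letter.
    m = re.match(r'(\w)(\w)(\w)$', text) or re.match(r'\s{2}(\w)()()$', text)
    return m.groups() if m else None

# The first two inputs are from the question; the rest are extra cases.
for text in ("abc", "  a", "a", "  abc", " ab"):
    print("%r: loose=%r strict=%r" % (text, groups_or_none(text), strict_groups(text)))
```

The loose pattern accepts all five inputs, while the stricter variant accepts only "abc" and "  a". Which behavior is right depends on what the real data looks like.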

Alex Martelli answered Sep 22 '22 11:09