Multiple re.sub() statements

In my program, the user enters a term, which I process before sending it on. Part of this processing is to change every instance of 'and', 'or' and 'not' to uppercase while leaving the rest of the string intact.

I can't use string.upper(), because it uppercases everything, or string.replace(), because if 'and' occurs inside another word, e.g. 'salamander', it changes that too, giving 'salamANDer'. I think my best option is the regex function re.sub(), which lets me replace whole words only. The next problem: I need a separate re.sub() call for each change I want to make. Is it possible to do all the changes in one statement? What I have isn't wrong, but I don't think it's good practice:

>>> import urllib2
>>> import re
>>> query = 'Lizards and Amphibians not salamander or newt'
>>> query = re.sub(r'\bnot\b', 'NOT', query)
>>> query = re.sub(r'\bor\b', 'OR', query)
>>> query = re.sub(r'\band\b', 'AND', query)
>>> query = urllib2.quote("'" + query + "'")
>>> print query
%27Lizards%20AND%20Amphibians%20NOT%20salamander%20OR%20newt%27
adohertyd asked Dec 03 '22 02:12



1 Answer

You can pass a function as the replacement argument to re.sub(); it is called with each match object, and its return value is used as the replacement:

>>> term = "Lizards and Amphibians not salamander or newt"
>>> re.sub(r"\b(not|or|and)\b", lambda m: m.group().upper(), term)
'Lizards AND Amphibians NOT salamander OR newt'
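
If the same substitution runs on every query, one option is to precompile the pattern and wrap it in a small helper. The sketch below also adds re.IGNORECASE so that input such as And is caught; that flag is an addition of mine, not part of the one-liner above, and the names OPERATORS and uppercase_operators are hypothetical. It is written for Python 3 but uses nothing version-specific beyond print().

```python
import re

# Hypothetical names for illustration; only the re module calls are standard.
# \b anchors the match to whole words, so "salamander" is left alone.
OPERATORS = re.compile(r"\b(?:and|or|not)\b", re.IGNORECASE)

def uppercase_operators(term):
    """Uppercase whole-word and/or/not; leave all other text untouched."""
    return OPERATORS.sub(lambda m: m.group().upper(), term)

print(uppercase_operators("Lizards And Amphibians not salamander or newt"))
# Lizards AND Amphibians NOT salamander OR newt
```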

However, I'd probably go with a non-regex solution:

>>> " ".join(s.upper() if s.lower() in ["and", "or", "not"] else s
...          for s in term.split())
'Lizards AND Amphibians NOT salamander OR newt'

Unlike the regex above, this version also normalizes whitespace (runs of spaces collapse to a single space) and handles mixed-case words like And.
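
One behavioral difference is worth checking before choosing the split-and-join approach: it compares whole whitespace-separated tokens, so a keyword with punctuation attached is not recognized, whereas the \b-based regex would still match it. A minimal sketch of that difference, assuming Python 3; the names KEYWORDS and upper_ops_split are hypothetical:

```python
# Hypothetical names for illustration.
KEYWORDS = {"and", "or", "not"}

def upper_ops_split(term):
    """Uppercase and/or/not by comparing whitespace-separated tokens."""
    return " ".join(
        word.upper() if word.lower() in KEYWORDS else word
        for word in term.split()
    )

# Repeated spaces collapse and mixed case is handled.
print(upper_ops_split("Lizards  And   newt"))   # Lizards AND newt

# "(not" is a single token, so it does not match a keyword.
print(upper_ops_split("frogs (not toads)"))     # frogs (not toads)
```

If queries may contain parentheses or quotes around the operators, the regex version is probably the safer choice.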

Sven Marnach answered Jan 03 '23 05:01
