
Regular expression: match start or whitespace

Tags:

python

regex

Can a regular expression match whitespace or the start of a string?

I'm trying to replace the currency abbreviation GBP with a £ symbol. I could just match anything starting with GBP, but I'd like to be a bit more conservative and look for certain delimiters around it.

>>> import re
>>> text = u'GBP 5 Off when you spend GBP75.00'
>>> re.sub(ur'GBP([\W\d])', ur'£\g<1>', text)  # matches GBP with any prefix
u'\xa3 5 Off when you spend \xa375.00'
>>> re.sub(ur'^GBP([\W\d])', ur'£\g<1>', text)  # matches at start only
u'\xa3 5 Off when you spend GBP75.00'
>>> re.sub(ur'(\W)GBP([\W\d])', ur'\g<1>£\g<2>', text)  # matches whitespace prefix only
u'GBP 5 Off when you spend \xa375.00'

Can I do both of the latter examples at the same time?

Mat asked Feb 08 '09 12:02


1 Answer

Use the OR "|" operator to match either the start of the string or a non-word character before GBP:

>>> re.sub(r'(^|\W)GBP([\W\d])', u'\g<1>£\g<2>', text)
u'\xa3 5 Off when you spend \xa375.00'
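For readers on Python 3, the same idea can be sketched as below. The function names `replace_gbp` and `replace_gbp_lookaround` are hypothetical, chosen only for illustration. The second function is not part of the answer above: it swaps the capturing groups for lookaround assertions, which check the delimiters without consuming them and so may behave better when two GBP tokens share a single delimiter.

```python
import re

# Same pattern as the answer: group 1 is either the start of the string
# or one non-word character; group 2 is the delimiter after GBP.
ALTERNATION = re.compile(r"(^|\W)GBP([\W\d])")

# Hypothetical variant (an assumption, not from the answer): lookarounds
# assert the same delimiters but leave them out of the match itself.
LOOKAROUND = re.compile(r"(?<!\w)GBP(?=[\W\d])")


def replace_gbp(text):
    """Replace GBP using the alternation pattern, re-emitting both delimiters."""
    return ALTERNATION.sub(r"\g<1>£\g<2>", text)


def replace_gbp_lookaround(text):
    """Replace GBP using lookarounds; the delimiters are never consumed."""
    return LOOKAROUND.sub("£", text)


if __name__ == "__main__":
    sample = "GBP 5 Off when you spend GBP75.00"
    print(replace_gbp(sample))
    print(replace_gbp_lookaround(sample))

    # The space after the first GBP is consumed by group 2, so the
    # alternation pattern cannot reuse it as the prefix of the second GBP.
    print(replace_gbp("GBP GBP 5"))
    print(replace_gbp_lookaround("GBP GBP 5"))
```

On the sample text from the question both functions should give the same result; they differ only on inputs such as "GBP GBP 5", where the alternation pattern leaves the second GBP untouched.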
Zach Scrivena answered Sep 27 '22 18:09