
Can you pass a dictionary when replacing strings in Python?

In PHP, you have preg_replace($patterns, $replacements, $string), where you can make all your substitutions at once by passing in an array of patterns and replacements.

What is the equivalent in Python?

I noticed that the string and re functions replace() and sub() don't take dictionaries...

Edited to clarify based on a comment by rick: the idea is to have a dict with keys to be taken as regular expression patterns, such as '\d+S', and (hopefully) constant string values (hopefully w/o backreferences). Now editing my answer accordingly (i.e. to answer the actual question).

rick asked Jun 02 '09 02:06


1 Answer

closest is probably:

somere.sub(lambda m: replacements[m.group()], text)

for example:

>>> import re
>>> za = re.compile(r'z\w')
>>> za.sub(lambda m: dict(za='BLU', zo='BLA')[m.group()], 'fa za zo bu')
'fa BLU BLA bu'

with a .get instead of []-indexing if you want to supply a default for matches that are missing in replacements.
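A minimal sketch of that .get variant, assuming you want unknown matches left as they are. The names replacements, pattern and replace_known are made up for illustration, and 'zu' is deliberately missing from the mapping:

```python
import re

# Hypothetical mapping; 'zu' is left out on purpose to exercise the default.
replacements = {'za': 'BLU', 'zo': 'BLA'}
pattern = re.compile(r'z\w')

def replace_known(text):
    # .get falls back to the matched text itself, so a match with no
    # entry in the mapping is put back unchanged instead of raising KeyError.
    return pattern.sub(lambda m: replacements.get(m.group(), m.group()), text)

print(replace_known('fa za zo zu bu'))  # fa BLU BLA zu bu
```

Any other default works the same way, e.g. replacements.get(m.group(), '') to drop unknown matches.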

Edit: what rick really wants is to have a dict with keys to be taken as regular expression patterns, such as '\d+S', and (hopefully) constant string values (hopefully w/o backreferences). The cookbook recipe can be adapted for this purpose:

import re

def dict_sub(d, text):
  """Replace in 'text' non-overlapping occurrences of REs whose patterns are keys
  in dictionary 'd' by corresponding values (which must be constant strings).
  The keys must not contain capturing groups of their own, named or anonymous,
  since those would shift the group numbers used for the lookup below; use
  (?:...) where grouping is needed.
  Returns the new string."""

  # Create a regular expression from the dictionary keys, one group per key
  regex = re.compile("|".join("(%s)" % k for k in d))
  # Facilitate lookup from group number to value (d.values() iterates in the
  # same order as the keys above)
  lookup = dict((i+1, v) for i, v in enumerate(d.values()))

  # For each match, find which group matched and expand its value
  return regex.sub(lambda mo: mo.expand(lookup[mo.lastindex]), text)

Example use:

  d = {r'\d+S': 'wot', r'\d+T': 'zap'}
  t = 'And 23S, and 45T, and 66T but always 029S!'
  print(dict_sub(d, t))

emits:

And wot, and zap, and zap but always wot!

You could avoid building lookup and just use mo.expand(list(d.values())[mo.lastindex-1]) (the list() call is needed on Python 3, where values() returns a view that cannot be indexed), but that might be a tad slow if d is very large and there are many matches (sorry, haven't precisely measured/benchmarked both approaches, so this is just a guess;-).
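For what it's worth, here is a sketch of a middle ground, not the recipe itself: build the list of values once outside the callback, so there is no lookup dict and no per-match list construction. The name dict_sub_list is hypothetical, and like everything above this is unbenchmarked:

```python
import re

def dict_sub_list(d, text):
    """Variant of dict_sub that indexes a list of values instead of a dict."""
    # One capturing group per key, in the dict's iteration order.
    regex = re.compile("|".join("(%s)" % k for k in d))
    # Materialize the values once; list() is needed on Python 3, where
    # d.values() is a view that does not support indexing.
    values = list(d.values())
    # Group numbers start at 1, list indices at 0.
    return regex.sub(lambda mo: mo.expand(values[mo.lastindex - 1]), text)

d = {r'\d+S': 'wot', r'\d+T': 'zap'}
t = 'And 23S, and 45T, and 66T but always 029S!'
print(dict_sub_list(d, t))  # And wot, and zap, and zap but always wot!
```

The same restriction applies here: keys must not bring capturing groups of their own, or the group numbers no longer line up with the list positions.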

Alex Martelli answered Oct 20 '22 00:10