I have the following regular expression:
pattern = r'^[a-zA-Z0-9-_]*_(?P<pos>[A-Z]\d\d)_T\d{4}(?P<fID>F\d{3})L\d{2}A\d{2}(?P<zID>Z\d{2})(?P<cID>C\d{2})\.tif$'
that matches file names like these:
filename = '151006_655866_Z01_T0001F015L01A02Z01C03.tif'
with groups:
m = re.match(pattern, filename)
print(m.group("pos"))  # Z01
print(m.group("fID"))  # F015
print(m.group("zID"))  # Z01
How can I replace only a specified group with a given string in Python?
I tried to use re.sub with a function as the replacement, but I don't know what this function should look like:
def replace_function(matchobj):
    # how to replace only a given match group?
    # (the following replaces *all* occurrences of "Z01" in this example)
    return matchobj.group(0).replace(matchobj.group("zID"), "---")

print(re.sub(pattern, replace_function, filename))
My desired result would be:
151006_655866_Z01_T0001F015L01A02---C03.tif
You can do this with a replacement function that uses the start/end indices of the chosen group; functools.partial binds the group name and the replacement string to it:
import re
from functools import partial

pattern = r'^[\w-]*_(?P<pos>[A-Z]\d{2})_T\d{4}(?P<fID>F\d{3})L\d{2}A\d{2}(?P<zID>Z\d{2})(?P<cID>C\d{2})\.tif$'
filename = '151006_655866_Z01_T0001F015L01A02Z01C03.tif'

def replace_closure(subgroup, replacement, m):
    # Leave the match untouched if the group did not participate or is empty.
    if m.group(subgroup) in [None, '']:
        return m.group()
    # m.start()/m.end() index into the whole string, so shift them to be
    # relative to the matched text returned by m.group().
    start = m.start(subgroup) - m.start()
    end = m.end(subgroup) - m.start()
    return m.group()[:start] + replacement + m.group()[end:]

subgroup_list = ['pos', 'fID', 'zID', 'cID']
replacement = '---'

for subgroup in subgroup_list:
    print(re.sub(pattern, partial(replace_closure, subgroup, replacement), filename))
Output:
151006_655866_---_T0001F015L01A02Z01C03.tif
151006_655866_Z01_T0001---L01A02Z01C03.tif
151006_655866_Z01_T0001F015L01A02---C03.tif
151006_655866_Z01_T0001F015L01A02Z01---.tif
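If you only need to replace one group in a single match, you may not need re.sub at all. The following is a minimal sketch of that alternative, assuming the pattern matches at most once per string (as it does here, since it is anchored with ^ and $). The helper name replace_group is hypothetical and not part of the re module; it slices the original string using the group's start and end offsets.

```python
import re

def replace_group(pattern, string, group, replacement):
    """Replace only the text captured by `group` in the first match of `pattern`.

    `replace_group` is a hypothetical helper name, not part of the `re` module.
    """
    m = re.search(pattern, string)
    # No match, or the group did not participate: return the input unchanged.
    if m is None or m.group(group) is None:
        return string
    # m.start(group) and m.end(group) are offsets into `string` itself.
    return string[:m.start(group)] + replacement + string[m.end(group):]

pattern = r'^[\w-]*_(?P<pos>[A-Z]\d{2})_T\d{4}(?P<fID>F\d{3})L\d{2}A\d{2}(?P<zID>Z\d{2})(?P<cID>C\d{2})\.tif$'
filename = '151006_655866_Z01_T0001F015L01A02Z01C03.tif'

print(replace_group(pattern, filename, 'zID', '---'))
# 151006_655866_Z01_T0001F015L01A02---C03.tif
```

Because the offsets come from the match object, the earlier "Z01" captured by the pos group is left alone, which is the behavior the question asks for.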