I am trying to create a report of all the text replacements my program makes with re.sub. I can't figure out how to capture the replaced text in a variable. Can anyone help me with this? Here is my code:
import re
Report_file = open("report.txt", "w")
st = '''<item><AP>item1</AP><AP>Item2</AP><AP>item3</AP><AP>Item4</AP></item>'''
outval = re.sub(r'(?i)item1', "value1", st)
outval = re.sub(r'(?i)item2', "value2", outval)
outval = re.sub(r'(?i)item3', "value3", outval)
print outval
I want the report file in the following format:
OLD: item1
NEW: value1
OLD: item2
NEW: value2
OLD: item3
NEW: value3
You need to pass a function instead of a replacement string:
def build_replacer(replacement):
    def replace(match):
        print match.group(), replacement
        return replacement
    return replace
Then run:
outval = re.sub(r'(?i)item1', build_replacer("value1"), st)
outval = re.sub(r'(?i)item2', build_replacer("value2"), outval)
outval = re.sub(r'(?i)item3', build_replacer("value3"), outval)
and it'll print the original text and its replacement.
This then gives:
>>> st = '''<item><AP>item1</AP><AP>Item2</AP><AP>item3</AP><AP>Item4</AP></item>'''
>>> outval = re.sub(r'(?i)item1', build_replacer("value1"), st)
item1 value1
>>> outval = re.sub(r'(?i)item2', build_replacer("value2"), outval)
Item2 value2
>>> outval = re.sub(r'(?i)item3', build_replacer("value3"), outval)
item3 value3
>>> outval
'<item><AP>value1</AP><AP>value2</AP><AP>value3</AP><AP>Item4</AP></item>'
Instead of printing you could also store that information elsewhere, of course.
The build_replacer() function just returns a new function, replace(), which is what re.sub() will call whenever it finds a match. Instead of directly replacing the matched text, re.sub() asks the function what to use as the replacement text.
The reason we nest replace() inside build_replacer() is so that the replacement text is stored in the enclosing scope; we can then build a replacement function for any replacement text without having to hardcode that text in the function body.
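To get the report rather than printed output, the same closure can append to a list instead of printing. The following is only a minimal sketch, written for Python 3 (unlike the Python 2 snippets above); the extra `log` parameter and the `report` variable are names I made up for illustration, not part of the original code.

```python
import re

def build_replacer(replacement, log):
    """Return a callback for re.sub that records each (old, new) pair in log."""
    def replace(match):
        # match.group() is the text actually found, e.g. 'Item2', not the pattern
        log.append((match.group(), replacement))
        return replacement
    return replace

st = '<item><AP>item1</AP><AP>Item2</AP><AP>item3</AP><AP>Item4</AP></item>'
log = []  # hypothetical name: collects (old, new) tuples in the order they occur

outval = st
for pattern, new in [(r'(?i)item1', 'value1'),
                     (r'(?i)item2', 'value2'),
                     (r'(?i)item3', 'value3')]:
    outval = re.sub(pattern, build_replacer(new, log), outval)

# Format the collected pairs the way the question asks for.
report = ''.join('OLD: %s\nNEW: %s\n' % pair for pair in log)
print(report)
```

Writing `report` to report.txt with an ordinary file write then gives the requested file. Note that the OLD lines show the text as it appeared in the input, so the second entry reads Item2 rather than item2.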
Both the question and the answer above need one instruction like
outval = re.sub(r'(?i)item3', .......... )
for every item to replace.
What if there are 56 items to replace?
In the following solution there are 5 items to replace, but the instruction
r.sub(fruiting, text) is written only once:
text = '''
OR 125
BA 48
Pr 12
ba 4
Cherry 147
Ba 10
Or 7
OR 6
Orange 2
AP 9
PR 3
Banana 101
or 22
pR 13
'''
import re

the_items = ('OR', 'BA', 'AP', 'PR', 'CH')
new_items = ('Orange', 'Banana', 'Apple', 'Pear', 'Cherry')
corresp = dict(zip(the_items, new_items))

# One alternation pattern covers every abbreviation, case-insensitively.
r = re.compile(r'(%s) *(\d+)' % '|'.join(the_items),
               re.IGNORECASE)

def fruiting(ma, longname=corresp):
    # Look up the full name for the matched abbreviation and keep the number.
    fresh = '%-12s %s' % (longname[ma.group(1).upper()],
                          ma.group(2))
    tu = ('OLD: %r\n'
          'NEW: %r\n'
          % (ma.group(), fresh))
    print tu
    return fresh

print '%s%s' % (text, r.sub(fruiting, text))
Result:
OLD: 'OR 125'
NEW: 'Orange 125'
OLD: 'BA 48'
NEW: 'Banana 48'
OLD: 'Pr 12'
NEW: 'Pear 12'
OLD: 'ba 4'
NEW: 'Banana 4'
OLD: 'Ba 10'
NEW: 'Banana 10'
OLD: 'Or 7'
NEW: 'Orange 7'
OLD: 'OR 6'
NEW: 'Orange 6'
OLD: 'AP 9'
NEW: 'Apple 9'
OLD: 'PR 3'
NEW: 'Pear 3'
OLD: 'or 22'
NEW: 'Orange 22'
OLD: 'pR 13'
NEW: 'Pear 13'
.
OR 125
BA 48
Pr 12
ba 4
Cherry 147
Ba 10
Or 7
OR 6
Orange 2
AP 9
PR 3
Banana 101
or 22
pR 13
Orange 125
Banana 48
Pear 12
Banana 4
Cherry 147
Banana 10
Orange 7
Orange 6
Orange 2
Apple 9
Pear 3
Banana 101
Orange 22
Pear 13
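Applied to the data from the question, the single-pattern idea might look like the sketch below. It is written for Python 3, and the names `replacements` and `substitute_with_log` are my own, hypothetical choices. Sorting the keys longest-first and escaping them are extra precautions I added (so that a key such as item10 would not be shadowed by item1); the code above does not need them for its fixed two-letter codes.

```python
import re

# Mapping taken from the question; keys are lowercase so lookups can be normalised.
replacements = {'item1': 'value1', 'item2': 'value2', 'item3': 'value3'}

# Longest keys first, so a longer key is tried before any key that is its prefix.
# re.escape guards against keys that contain regex metacharacters.
keys = sorted(replacements, key=len, reverse=True)
pattern = re.compile('|'.join(re.escape(k) for k in keys), re.IGNORECASE)

def substitute_with_log(text):
    """Replace every key in one pass; return (new_text, list of (old, new) pairs)."""
    log = []
    def replace(match):
        new = replacements[match.group().lower()]
        log.append((match.group(), new))
        return new
    return pattern.sub(replace, text), log

st = '<item><AP>item1</AP><AP>Item2</AP><AP>item3</AP><AP>Item4</AP></item>'
outval, log = substitute_with_log(st)
for old, new in log:
    print('OLD: %s\nNEW: %s' % (old, new))
```

Because the whole input is scanned once, a replacement value can never be replaced again by a later pattern, which can happen with chained re.sub calls.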