
Python regex and replace with incremented number




I have a file that contains several lines of problematic syntax. I'd like to find all occurrences of it and replace them with acceptable syntax.


<field id="someId" type="xs:decimal" bind="someId">
    <region id="Calc.R_315.`0" page="1"/>
    <region id="Calc.R_315.`1" page="1"/>

I'd like to do a string replacement of all occurrences of

<dot><tick><number> i.e. .`0 or .`1 or .`2 et cetera

with

<dash><number> i.e. -1 or -2 or -3

Notice it begins at 1 instead of 0.

I have the following Python code, which performs an in-place replacement; however, the numbering starts at 0, and I'd like it to start at 1.

import fileinput
import re

with fileinput.input(files="file.xml", inplace=True, backup='.original.bak', mode='r') as f:
    for line in f:
        pattern = "\.`(\d+)"
        result = re.sub(pattern, lambda exp: "-{}".format(exp.groups()[0]), line)
        print(result, end='')

How can I accomplish this?

arabian_albert asked Feb 06 '18 17:02


2 Answers

You are almost at the solution yourself!

The only thing remaining is to convert the captured number into an int, and add 1 to it. Simple!

So the relevant line of code becomes:

result = re.sub(pattern, lambda exp: "-{}".format(int(exp.groups()[0]) + 1), line)

Another slight modification that can be made is to change .groups()[0] to .group(1). You can learn more about group and its usage in the documentation.
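As a small illustration of that point (the sample string and the variable name m are made up for this sketch), .group(1) and .groups()[0] return the same captured text, while .group(0) is the whole match:

```python
import re

# Hypothetical sample input, modelled on the lines in the question.
m = re.search(r"\.`(\d+)", 'id="Calc.R_315.`7"')

print(m.group(0))   # the whole match, i.e. dot, tick and digits
print(m.group(1))   # the first capture group, i.e. the digits only
print(m.groups())   # a tuple holding every capture group

# Both spellings give the same string for the first group.
assert m.group(1) == m.groups()[0]
```

.group(1) is a little shorter and does not build the intermediate tuple, which is why it is the more common spelling.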

One last thing: It is always better to define your regex pattern as a raw string, i.e. r"\.`(\d+)", so that Python does not try to interpret the backslashes as string escapes before the regex engine sees them.
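Putting the suggestions together, here is a minimal sketch. The names PATTERN and renumber are my own, not from the question, and it works on a plain string rather than a file so it can be tried in isolation:

```python
import re

# In a normal string "\b" is one backspace character; in a raw string it
# stays as backslash + b, which the regex engine reads as a word boundary.
assert len("\b") == 1
assert len(r"\b") == 2

# Raw-string pattern, compiled once: dot, backtick, then the captured digits.
PATTERN = re.compile(r"\.`(\d+)")


def renumber(line):
    """Replace every .`N in line with -(N+1)."""
    return PATTERN.sub(lambda m: "-{}".format(int(m.group(1)) + 1), line)


sample = '<region id="Calc.R_315.`0" page="1"/>'
print(renumber(sample))
```

Inside the fileinput loop from the question you would then write print(renumber(line), end='').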

nisemonoxide answered Sep 26 '22 15:09


You can try this:

import re

s = """
<field id="someId" type="xs:decimal" bind="someId">
   <region id="Calc.R_315.`0" page="1"/>
   <region id="Calc.R_315.`1" page="1"/>
"""
# Swap each .`N for a {} placeholder, then fill the placeholders with -(N+1).
new_s = re.sub(r'\.`\d+', '{}', s).format(*map(lambda x: '-{}'.format(int(x) + 1), re.findall(r'(?<=\.`)\d+(?=")', s)))
print(new_s)

Output:

<field id="someId" type="xs:decimal" bind="someId">
  <region id="Calc.R_315-1" page="1"/>
  <region id="Calc.R_315-2" page="1"/>
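
One caveat with the placeholder approach, shown as a rough sketch: str.format treats any literal braces already in the text as format fields, so it can fail on input that contains them. The sample text and the names failed and fixed are invented for illustration. Passing a function to re.sub avoids this, because no format string is ever built from the input.

```python
import re

# Hypothetical input with a literal {draft} next to a value to renumber.
text = 'note {draft} <region id="Calc.R_315.`0" page="1"/>'

# Placeholder approach: {draft} is parsed as a named format field.
try:
    re.sub(r"\.`\d+", "{}", text).format("-1")
    failed = False
except (KeyError, IndexError, ValueError):
    failed = True

# Callable approach: each match is rewritten on its own, braces are untouched.
fixed = re.sub(r"\.`(\d+)", lambda m: "-{}".format(int(m.group(1)) + 1), text)

print(failed)
print(fixed)
```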
Ajax1234 answered Sep 26 '22 15:09
