Perform simple math on regular expression output? (Python)

Question

Is it possible to perform simple math on the output from Python regular expressions?

I have a large file where I need to divide numbers following a ")" by 100. For instance, I would convert the following line containing )75 and )2:

((words:0.23)75:0.55(morewords:0.1)2:0.55);

to )0.75 and )0.02:

((words:0.23)0.75:0.55(morewords:0.1)0.02:0.55);

My first thought was to use re.sub with the search expression "\)\d+", but I don't know how to divide the integer following the parenthesis by 100, or whether this is even possible with re.

Any thoughts on how to solve this? Thanks for your help!

David Robinson · Accepted Answer

You can do it by providing a function as the replacement:

import re

s = "((words:0.23)75:0.55(morewords:0.1)2:0.55);"

# Capture the digits after ")" and replace them with their value divided by 100
s = re.sub(r"\)(\d+)", lambda m: ")" + str(float(m.group(1)) / 100), s)

print(s)
# ((words:0.23)0.75:0.55(morewords:0.1)0.02:0.55);

Incidentally, if you wanted to do it using BioPython's Newick tree parser instead, it would look like this:

# assuming you want to read from a string rather than a file
from io import StringIO

from Bio import Phylo

# s is the original, unmodified Newick string
tree = Phylo.read(StringIO(s), "newick")

for c in tree.get_nonterminals():
    if c.confidence is not None:
        c.confidence = c.confidence / 100

print(tree.format("newick"))

(While this particular operation takes more lines than the regex version, other operations involving trees might be much easier with it.)

BrenBarn · Answer

The replacement expression for re.sub can be a function. Write a function that takes the matched text, converts it to a number, divides it by 100, and then returns the string form of the result.
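
As a rough sketch of that approach, the replacement can be written as a named function rather than a lambda. The names divide_by_100 and rescale are made up for illustration; the pattern is the one from the question, with a capture group added around the digits.

```python
import re


def divide_by_100(match):
    """Return ")" followed by the captured integer divided by 100."""
    # group(1) holds only the digits; the ")" itself is outside the group
    value = float(match.group(1)) / 100
    return ")" + str(value)


def rescale(text):
    """Apply divide_by_100 to every ")<digits>" occurrence in text."""
    # re.sub calls the function once per match and uses its return value
    return re.sub(r"\)(\d+)", divide_by_100, text)


line = "((words:0.23)75:0.55(morewords:0.1)2:0.55);"
print(rescale(line))
```

A named function is easier to extend than a lambda if you later need fixed-width formatting of the result or want to skip certain values.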
