
How to properly iterate with re.sub() in Python

Tags:

python

regex

I want to write a Python script that creates footnotes. The idea is to find all strings of the form "Some body text.{^}{Some footnote text.}" and replace them with "Some body text.^#", where "^#" is the proper footnote number. (A different part of my script deals with actually printing out the footnotes at the bottom of the file.) The code I'm currently using for this is:

import re

pattern = r"\{\^\}\{(.*?)\}"
i = 0
def create_footnote_numbers(match):
    global i
    i += 1
    return "<sup>" + str(i) + "</sup>"

new_body_text = re.sub(pattern, create_footnote_numbers, text)

This works fine, but it seems weird to have to define a variable (i) outside the create_footnote_numbers function and then declare it global inside that function. I would have thought there'd be something in re that returns the number of the match.

Alan asked May 26 '13 17:05



2 Answers

Any callable can be used, so you could use a class to track the numbering:

class FootnoteNumbers(object):
    def __init__(self, start=1):
        self.count = start - 1

    def __call__(self, match):
        self.count += 1
        return "<sup>{}</sup>".format(self.count)


new_body_text = re.sub(pattern, FootnoteNumbers(), text)

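Now the counter state is contained in the FootnoteNumbers instance. Because a new instance is created for each re.sub() call, self.count starts afresh on every run; reusing the same instance would continue the numbering where the previous run left off.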
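As a rough illustration of where the counter state lives, here is a small self-contained sketch. It reuses the pattern from the question and the class above; the `sample` string and the variable names holding the results are made up for the demonstration.

```python
import re

pattern = r"\{\^\}\{(.*?)\}"


class FootnoteNumbers(object):
    def __init__(self, start=1):
        self.count = start - 1

    def __call__(self, match):
        self.count += 1
        return "<sup>{}</sup>".format(self.count)


# Hypothetical sample text with two footnotes.
sample = "First.{^}{Note one.} Second.{^}{Note two.}"

# A fresh instance per re.sub() call restarts the numbering at 1.
fresh_run = re.sub(pattern, FootnoteNumbers(), sample)
second_run = re.sub(pattern, FootnoteNumbers(), sample)

# Reusing one instance keeps counting across calls.
counter = FootnoteNumbers()
reused_first = re.sub(pattern, counter, sample)
reused_second = re.sub(pattern, counter, sample)

print(fresh_run)      # First.<sup>1</sup> Second.<sup>2</sup>
print(reused_second)  # First.<sup>3</sup> Second.<sup>4</sup>
```

Reusing a single instance could be handy if the footnotes of several text chunks should share one running sequence.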

Martijn Pieters answered Oct 17 '22 08:10



It seems like a good fit for a closure:

def make_footnote_counter(start=1):
    count = [start - 1] # emulate nonlocal keyword
    def footnote_counter(match):
        count[0] += 1
        return "<sup>%d</sup>" % count[0]
    return footnote_counter

new_body_text = re.sub(pattern, make_footnote_counter(), text)
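The one-element list above stands in for the nonlocal keyword. On Python 3, where nonlocal is available, the same closure could be sketched as follows; the `sample` string and `result` name are made up for illustration, and the pattern is the one from the question.

```python
import re

pattern = r"\{\^\}\{(.*?)\}"


def make_footnote_counter(start=1):
    count = start - 1

    def footnote_counter(match):
        # Python 3 only: rebind the variable from the enclosing scope.
        nonlocal count
        count += 1
        return "<sup>%d</sup>" % count

    return footnote_counter


# Hypothetical sample text with two footnotes.
sample = "A.{^}{x} B.{^}{y}"
result = re.sub(pattern, make_footnote_counter(), sample)
print(result)  # A.<sup>1</sup> B.<sup>2</sup>
```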
jfs answered Oct 17 '22 08:10
