
Compiling a regex inside a function that's called multiple times

Tags: python, regex

If you compile a regex inside a function, and that function gets called multiple times, does Python recompile the regex each time, or does Python cache the compiled regex (assuming the regex doesn't change)?

For example:

import re

def contains_text_of_interest(line):
    r = re.compile(r"foo\dbar\d")
    return r.match(line)

def parse_file(fname):
    with open(fname) as f:
        for line in f:
            if contains_text_of_interest(line):
                pass  # Do something interesting
Lorin Hochstein asked Aug 06 '10 20:08




1 Answer

Actually, if you look at the code in the re module, re.compile uses the same cache as the other module-level functions, so compiling the same regex over and over is very cheap: after the first call it is essentially a dictionary lookup. In other words, write the code to be as understandable, maintainable, or expressive as you can, and don't worry about the overhead of compiling regexes.
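
A minimal sketch of what that means in practice, assuming CPython's re module. The function names here are made up for illustration, and the identity check relies on an implementation detail (a cache hit returning the very same pattern object) rather than a documented guarantee:

```python
import re

PATTERN = r"foo\dbar\d"  # the pattern from the question

def compiled_inside(line):
    # Calls re.compile on every invocation; after the first call
    # this is served from the re module's internal cache.
    return re.compile(PATTERN).match(line)

# Alternative: compile once at import time and reuse the pattern object.
COMPILED = re.compile(PATTERN)

def compiled_outside(line):
    return COMPILED.match(line)

def cache_returns_same_object():
    # Start from an empty cache so the check does not depend on earlier calls.
    re.purge()
    # In CPython, the second call is a cache hit and returns the same object.
    return re.compile(PATTERN) is re.compile(PATTERN)
```

Both functions give the same results. The module-level version only skips the function call and cache lookup, which may matter in a very hot loop but is rarely worth trading readability for. The cache is also bounded in size, so a program that cycles through a large number of distinct patterns may still trigger recompilation.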

Ned Batchelder answered Sep 29 '22 16:09
