What is the significance of clearing the cache while working with re in Python?
Does it help with performance or memory management? What happens if we ignore it? Where should re.purge() be called?
Most code will not need to worry about purging the re module cache. Purging brings very little memory benefit and can actually hurt performance, because the purged patterns have to be compiled again the next time they are used.
The cache is used to store compiled regular expression objects when you use the top-level re.* functions directly rather than re.compile(pattern). For example, if you used re.search(r'<some pattern>', string_value) in a loop, then the re module would compile '<some pattern>' only once and store it in the cache, avoiding having to re-compile the pattern each time.
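The caching can be observed directly with a small sketch. It relies on CPython behaviour that is an implementation detail, not a documented guarantee: re.compile() goes through the same cache as the module-level functions, so compiling the same pattern twice hands back the same object. PATTERN is a hypothetical example pattern.

```python
import re

PATTERN = r"\d{4}-\d{2}-\d{2}"  # hypothetical example pattern

re.purge()  # start from an empty cache so the demo is repeatable

first = re.compile(PATTERN)
second = re.compile(PATTERN)

# In CPython the second call is served from the cache, so both names
# refer to the same compiled pattern object (implementation detail).
same_object_from_cache = first is second

re.purge()  # empty the cache again

third = re.compile(PATTERN)

# After the purge the pattern had to be compiled again, giving a new object.
recompiled_after_purge = third is not first

print(same_object_from_cache, recompiled_after_purge)
```

The identity checks are only a way to peek at the cache; real code should not depend on them.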
How many such objects are cached and how the cache is managed are implementation details. Regular expression objects are lightweight, taking up at most a few hundred bytes each, and Python won't store more than a few hundred of them (Python 3.7 stores up to 512).
The cache is also automatically managed, so purging is not normally needed at all. Use it if you specifically need to account for regular expression compilation time in a repeated time trial test involving re.* functions, or are testing the caching functionality itself. The only locations in the Python standard library that call re.purge() are in tests (specifically in the test_re unit tests for the re module and the reference leak test in the regression test suite).
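A time trial of that kind might look like the following minimal sketch. PATTERN, TEXT, RUNS and the two helper functions are hypothetical names chosen for illustration; the "cold" figure also includes the small cost of the re.purge() call itself.

```python
import re
import timeit

PATTERN = r"(\w+)@(\w+)\.(com|org|net)"  # hypothetical example pattern
TEXT = "contact: someone@example.org"    # hypothetical input string
RUNS = 200


def search_cold():
    # Purge first so that every call pays the compilation cost.
    re.purge()
    return re.search(PATTERN, TEXT)


def search_warm():
    # No purge: after the first call the compiled pattern comes from the cache.
    return re.search(PATTERN, TEXT)


cold_seconds = timeit.timeit(search_cold, number=RUNS)
warm_seconds = timeit.timeit(search_warm, number=RUNS)

print(f"with purge:    {cold_seconds:.6f}s for {RUNS} runs")
print(f"without purge: {warm_seconds:.6f}s for {RUNS} runs")
```

The gap between the two figures is roughly the compilation time that the cache saves. The absolute numbers depend on the machine and the Python version.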
If your code is creating a lot of regular expression objects that you intend to keep using, it is better to use re.compile() and keep your own references to those compiled expression objects. See the re.compile() documentation:
The sequence
    prog = re.compile(pattern)
    result = prog.match(string)
is equivalent to
    result = re.match(pattern, string)
but using re.compile() and saving the resulting regular expression object for reuse is more efficient when the expression will be used several times in a single program.
Note: The compiled versions of the most recent patterns passed to re.compile() and the module-level matching functions are cached, so programs that use only a few regular expressions at a time needn’t worry about compiling regular expressions.
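Keeping your own reference could look like this short sketch. WORD_RE and count_words are hypothetical names, and the pattern is only an example.

```python
import re

# Compiled once at import time. The module holds its own reference, so reuse
# does not depend on what the re module's internal cache currently contains.
WORD_RE = re.compile(r"[A-Za-z]+")  # hypothetical example pattern


def count_words(lines):
    """Count alphabetic words across an iterable of strings."""
    return sum(len(WORD_RE.findall(line)) for line in lines)


total = count_words(["purge the cache", "or keep a reference"])
print(total)
```

Because WORD_RE stays referenced, a call to re.purge() elsewhere in the program has no effect on it.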