Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using a global flag for python RegExp compile

Tags:

python

regex

Would it be possible to define a global flag so that Python's re.compile() automatically sets it ? For instance I want to use re.DOTALL flag for all my RegExp in -- say -- a class?

It may sound weird at first, but I'm not really in control of this part of the code since it's generated by YAPPS. I just give YAPPS a string containing a RegExp and it calls re.compile(). Alas, I need to use it in re.DOTALL mode.

A quick fix would be to edit the generated parser and add the proper option. But I still hope there's another and more automated way to do this.

EDIT: Python allows you to set flags with the (?...) construct, so in my case re.DOTALL is (?s). Though usefull, it doesn't apply on a whole class, or on a file.

So my question still holds.

like image 560
MP0 Avatar asked Aug 04 '11 15:08

MP0


People also ask

How do you use flags in regex Python?

M flag is used as an argument inside the regex method to perform a match inside a multiline block of text. Note: This flag is used with metacharacter ^ and $ . When this flag is specified, the pattern character ^ matches at the beginning of the string and each newline's start ( \n ).

What does the global flag in regex do?

The g flag indicates that the regular expression should be tested against all possible matches in a string. Each call to exec() will update its lastIndex property, so that the next call to exec() will start at the next character.

Does glob use regex?

The pattern rules for glob are not regular expressions. Instead, they follow standard Unix path expansion rules. There are only a few special characters: two different wild-cards, and character ranges are supported.


2 Answers

Yes, you can change it to be globally re.DOTALL. But you shouldn't. Global settings are a bad idea at the best of times -- this could cause any Python code run by the same instance of Python to break.


So, don't do this:

The way you can change it is to use the fact that the Python interpreter caches modules per instance, so that if somebody else imports the same module they get the object to which you also have access. So you could rebind re.compile to a proxy function that passes re.DOTALL.

import re
re.my_compile = re.compile
re.compile = lambda pattern, flags: re.my_compile(pattern, flags | re.DOTALL)

and this change will happen to everybody else.

You can even package this up in a context manager, as follows:

from contextlib import contextmanager

@contextmanager
def flag_regexen(flag):
    import re
    re.my_compile = re.compile
    re.compile = lambda pattern, flags: re.my_compile(pattern, flags | flag)
    yield
    re.compile = re.my_compile

and then

with flag_regexen(re.DOTALL):
    <do stuff with all regexes DOTALLed>
like image 175
Katriel Avatar answered Sep 20 '22 05:09

Katriel


All of the flags can be set in the regular expression itself:

r"(?s)Your.*regex.*here"
like image 39
Wooble Avatar answered Sep 20 '22 05:09

Wooble