
Why re.escape escapes space

Tags:

python

regex

Recently I found that re.escape is a quick way to build a regular expression from a string. When I pass a string like 'a b c', I'm confused about why every space is escaped with a \ character. AFAIK, when writing an equivalent expression by hand to match that string, it's unnecessary to escape the space character. Why does this difference exist? Thanks.

Chenxiong Qi asked Sep 06 '15 03:09





1 Answer

It does so in order to be explicit. An unescaped space normally matches a literal space, but in a verbose regular expression it is treated as layout and is not meant for matching.

The resulting regex, I guess /a\ b\ c/, is a very explicit regex matching an a followed by a single space, followed by a b, followed by a single space, followed by a c.
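A quick check of what re.escape actually returns for the string in the question. This is only a small sketch; the variable names are made up for illustration.

```python
import re

text = "a b c"

# re.escape puts a backslash in front of each space.
escaped = re.escape(text)
print(escaped)  # a\ b\ c

# In the default (non-verbose) mode the escaped pattern and the
# hand-written pattern with plain spaces both match the string.
print(re.fullmatch(escaped, text) is not None)   # True
print(re.fullmatch("a b c", text) is not None)   # True
```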

If you write it yourself, you could also use /a\sb\sc/, which would match any single whitespace character (space, tab, newline, ...) between the letters. Or even:

import re

# re.VERBOSE allows whitespace and # comments inside the pattern
r = re.compile(r"""a  # match a
b  # match b
c  # match c
""", re.VERBOSE)

This last one has to be compiled with re.VERBOSE, which is a way to lay out your regex readably in your source code. In that mode the unescaped whitespace is ignored completely, so the pattern is equivalent to abc and would not match your case. With regex, always keep in mind that everything that is not explicit will fail some Sunday morning at 3am.
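To see why the escaping matters, here is a small comparison of an unescaped and an escaped pattern with and without re.VERBOSE. The helper fully_matches and the variable names are hypothetical, written just for this illustration.

```python
import re

def fully_matches(pattern: str, text: str, flags: int = 0) -> bool:
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text, flags) is not None

target = "a b c"
handwritten = "a b c"        # spaces left unescaped
escaped = re.escape(target)  # spaces escaped with a backslash

# Default mode: an unescaped space is a literal space.
print(fully_matches(handwritten, target))              # True

# Verbose mode: unescaped spaces are ignored, so the pattern means "abc".
print(fully_matches(handwritten, target, re.VERBOSE))  # False
print(fully_matches(handwritten, "abc", re.VERBOSE))   # True

# The escaped pattern keeps matching the original string in verbose mode.
print(fully_matches(escaped, target, re.VERBOSE))      # True
```

So the output of re.escape behaves the same whether or not the surrounding pattern is later compiled with re.VERBOSE, which the hand-written version does not.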

Oliver Friedrich answered Oct 04 '22 16:10
