
What does the "r" in Python's re.compile(r' pattern flags') mean?

Tags:

python

regex

I am reading through http://docs.python.org/2/library/re.html. According to this, the "r" in Python's re.compile(r' pattern flags') refers to the raw string notation:

The solution is to use Python’s raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with 'r'. So r"\n" is a two-character string containing '\' and 'n', while "\n" is a one-character string containing a newline. Usually patterns will be expressed in Python code using this raw string notation.

Would it be fair to say then that:

re.compile(r pattern) means that "pattern" is a regex, while re.compile(pattern) means that "pattern" is an exact match?

user1592380 asked Jan 14 '14 01:01




1 Answer

As @PauloBu stated, the r string prefix is not specific to regexes; it applies to string literals in Python generally.

Normal strings use the backslash character as an escape character for special characters (like newlines):

>>> print('this is \n a test')
this is
 a test

The r prefix tells the interpreter not to do this:

>>> print(r'this is \n a test')
this is \n a test
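To make the difference concrete, here is a small sketch (the variable names are only for illustration) showing that both literals produce ordinary str objects and that only the handling of the backslash differs:

```python
# A raw literal and a normal literal both produce plain str objects;
# only the way the backslash in the source text is read differs.
normal = "\n"   # escape is processed: one newline character
raw = r"\n"     # escape is left alone: a backslash followed by 'n'

print(len(normal))                # 1
print(len(raw))                   # 2
print(raw == "\\n")               # True: same as escaping the backslash by hand
print(type(raw) is type(normal))  # True: there is no separate "raw string" type
```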

This is important in regular expressions, because the backslash needs to reach the re module intact. In particular, \b matches the empty string at the start or end of a word. re expects the two-character string \b, but in a normal string literal '\b' is converted to the ASCII backspace character, so you need to either escape the backslash explicitly ('\\b') or tell Python it is a raw string (r'\b').

>>> import re
>>> re.findall('\b', 'test')  # the backslash gets consumed by the Python string interpreter
[]
>>> re.findall('\\b', 'test')  # backslash is explicitly escaped and is passed through to the re module
['', '']
>>> re.findall(r'\b', 'test')  # often this syntax is easier
['', '']
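Coming back to the original question: whatever string reaches re.compile is treated as a regex, with or without the r prefix. The sketch below uses made-up sample strings to show this. The re.escape call goes slightly beyond the answer above; it is the standard-library helper I would reach for if a literal ("exact") match is what you actually want.

```python
import re

# Without backslashes, the r prefix changes nothing: same string, same regex.
same_literal = 't.st' == r't.st'

# Both spellings are regexes, so '.' acts as a wildcard in each.
as_regex = re.findall('t.st', 'test tast t.st')

# For a literal match, escape the metacharacters first.
as_literal = re.findall(re.escape('t.st'), 'test tast t.st')

# With a backslash the prefix matters: '\b' is a backspace, r'\b' a word boundary.
no_hits = re.findall('\bcat\b', 'cat concat cat.')
hits = re.findall(r'\bcat\b', 'cat concat cat.')

print(same_literal)  # True
print(as_regex)      # ['test', 'tast', 't.st']
print(as_literal)    # ['t.st']
print(no_hits)       # []
print(hits)          # ['cat', 'cat']
```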
Peter Gibson answered Oct 02 '22 01:10