
python regular expression "\1"

Tags: python, regex

Can anyone tell me what "\1" means in the following regular expression in Python?

re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat') 
Mengwen asked Dec 27 '13 14:12





2 Answers

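\1 refers to the text matched by the first parenthesized group in the regex, i.e. the same string that re.search(...).group(1) would return. Inside the pattern it is a backreference: it matches only if that exact text appears again at that point.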

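Fun fact: backreferences are also part of the reason that regular expressions in Python and many other programming languages can be significantly slower than CS theory requires. A pattern with a backreference no longer describes a regular language, so these engines rely on backtracking rather than a finite automaton.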
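To make the comparison with group(1) concrete, here is a minimal sketch using the pattern and string from the question. The variable names are illustrative and not part of the original post.

```python
import re

text = 'cat in the the hat'
pattern = r'(\b[a-z]+) \1'

# Find the first place where a run of lowercase letters is followed
# by a space and then the very same text again.
match = re.search(pattern, text)

captured = match.group(1)  # text matched by the first parenthesized group
whole = match.group(0)     # the entire match, including the repeat

print(captured)  # the
print(whole)     # the the
```

Within the pattern, \1 has to match whatever `captured` holds at that moment, which is why only "the the" qualifies in this string.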

Patrick Collins answered Sep 19 '22 03:09



The first \1 (the one in the pattern) refers to the first group, i.e. the first parenthesized expression, (\b[a-z]+). It matches whatever text that group has just matched.

From the docs, under \number:

"Matches the contents of the group of the same number. Groups are numbered starting from 1. For example, (.+) \1 matches 'the the' or '55 55', but not 'thethe' (note the space after the group)"
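The three strings from the documentation's example can be checked directly. This is a small sketch; using re.fullmatch (so the whole string has to fit the pattern) and the names below are my own choices, not something the docs prescribe.

```python
import re

doc_pattern = re.compile(r'(.+) \1')

samples = ['the the', '55 55', 'thethe']

# True when the entire string is "something, a space, the same something".
results = {s: doc_pattern.fullmatch(s) is not None for s in samples}

print(results)  # {'the the': True, '55 55': True, 'thethe': False}
```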

In your case, it looks for a repeated "word" (strictly, a run of lowercase letters) with a single space between the two copies.

The second \1 (the one in the replacement string) stands for the text captured by group 1, so each repeated word is replaced by a single copy of that word.
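Running the call from the question shows the effect. The variable names in this sketch are illustrative only.

```python
import re

original = 'cat in the the hat'

# Pattern: a run of lowercase letters, a space, then the same run again.
# Replacement: just the captured run, so the duplicate is dropped.
deduplicated = re.sub(r'(\b[a-z]+) \1', r'\1', original)

print(deduplicated)  # cat in the hat
```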

doctorlove answered Sep 23 '22 03:09
