
Why does this Regex only match at the start of the line in Python? [duplicate]

Tags:

python

regex

In Python I can do

import re
re.match("m", "mark")

and I get the expected result:

<_sre.SRE_Match object; span=(0, 1), match='m'>

But it only works if the pattern is at the start of the string:

re.match("m", "amark")

gives None. There is nothing about that pattern which requires it to be at the start of the string - no ^ or similar. Indeed it works as expected on regex101.

Does Python have some special behaviour - and how do I disable it please?

Mark Smith asked Dec 09 '15 10:12


1 Answer

From the docs on re.match:

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object.

Use re.search to search the entire string.
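
To make the difference concrete, here is a minimal sketch using the same pattern and string as the question. The variable names at_start and anywhere are only illustrative.

```python
import re

# re.match only tries the pattern at position 0 of the string.
at_start = re.match("m", "amark")    # no match: "amark" starts with "a"

# re.search scans forward and returns the first match anywhere in the string.
anywhere = re.search("m", "amark")

print(at_start)         # None
print(anywhere.span())  # (1, 2)
```

So there is nothing to disable: re.match is simply anchored by design, and re.search is the unanchored counterpart.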

The docs even give this issue its own section, outlining the differences between the two: search() vs. match()
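
One difference worth knowing, as far as I understand the documented behaviour: even in MULTILINE mode, re.match is still tried only at the very start of the string, while re.search with ^ can match at the start of any line. A small sketch (the sample string is made up for illustration):

```python
import re

text = "a\nmark"  # illustrative two-line string

# Even with re.MULTILINE, re.match is attempted only at the start of the string.
print(re.match("^m", text, re.MULTILINE))    # None

# re.search with re.MULTILINE lets ^ match at the start of each line.
found = re.search("^m", text, re.MULTILINE)
print(found.span())                          # (2, 3)
```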

hlt answered Nov 14 '22 23:11