The regex docs say:
Pattern.match(...)
If zero or more characters at the beginning of string match this regular expression
Pattern.fullmatch(...)
If the whole string matches this regular expression
Pattern.search(...)
Scan through string looking for the first location where this regular expression produces a match
Given the above, why couldn't someone just always use search to do everything? For example:
re.search(r'...')  # search
re.search(r'^...') or re.search(r'\A...')  # match
re.search(r'^...$') or re.search(r'\A...\Z')  # fullmatch
Are match and fullmatch just shortcuts (if they could be called that) for the search method? Or do they have other uses that I'm overlooking?
re.search scans the whole string for the pattern, whereas re.match does not scan at all: it can only match the pattern at the start of the string.
Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default).
The re.match() function checks whether the regular expression matches at the beginning of the string. If it does, a match object is returned; otherwise None is returned. It never looks for a match that starts later in the string.
The re.search() function scans the string for the regular expression pattern and returns the first occurrence. Unlike re.match(), it checks every position in the input string, including later lines. If the pattern is found, a match object is returned; otherwise None is returned.
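A minimal sketch of the difference described above, using only the standard re module. The sample string and patterns are made up for illustration.

```python
import re

text = "spam and eggs"  # hypothetical sample string

# match() only tries position 0, so a pattern that occurs later is not found.
m_match = re.match(r"eggs", text)

# search() scans forward and returns the first location that matches.
m_search = re.search(r"eggs", text)

# When nothing matches, both functions return None.
m_missing = re.search(r"bacon", text)

print(m_match)          # None
print(m_search.span())  # (9, 13)
print(m_missing)        # None
```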
Giving credit for @Ruzihm's answer since parts of my answer derive from his.
A quick rundown of the differences:
re.match is anchored at the start: ^pattern
re.fullmatch is anchored at the start and end of the pattern: ^pattern$
re.search is not anchored: pattern
A more in-depth comparison of re.match vs re.search can be found here
With examples:
aa # string
a|aa # regex
re.match: a
re.search: a
re.fullmatch: aa
ab # string
^a # regex
re.match: a
re.search: a
re.fullmatch: # None (no match)
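The two examples above can be reproduced with a short script. The show helper is a name made up for this sketch; it is not part of the re module.

```python
import re

def show(pattern, string):
    """Return the matched text (or None) for match, search and fullmatch."""
    results = {}
    for name in ("match", "search", "fullmatch"):
        m = getattr(re, name)(pattern, string)
        results[name] = m.group() if m else None
    return results

first = show(r"a|aa", "aa")
second = show(r"^a", "ab")

print(first)   # {'match': 'a', 'search': 'a', 'fullmatch': 'aa'}
print(second)  # {'match': 'a', 'search': 'a', 'fullmatch': None}
```

Note that fullmatch picks the second alternative in a|aa because the first one does not consume the whole string.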
What about the \A and \Z anchors?
The documentation states the following:
Python offers two different primitive operations based on regular expressions: re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string (this is what Perl does by default).
And in the Pattern.fullmatch section it says:
If the whole string matches this regular expression, return a corresponding match object.
And, as initially found and quoted by Ruzihm in his answer:
Note however that in MULTILINE mode match() only matches at the beginning of the string, whereas using search() with a regular expression beginning with '^' will match at the beginning of each line.
>>> re.match('X', 'A\nB\nX', re.MULTILINE)  # No match
>>> re.search('^X', 'A\nB\nX', re.MULTILINE)  # Match
<re.Match object; span=(4, 5), match='X'>
\A^A
B
X$\Z
# re.match('X', s) no match
# re.search('^X', s) no match
# ------------------------------------------
# and the string above when re.MULTILINE is enabled effectively becomes
\A^A$
^B$
^X$\Z
# re.match('X', s, re.MULTILINE) no match
# re.search('^X', s, re.MULTILINE) match X
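The four cases in the diagram above can be checked directly. This is a small sketch that assumes s is the three-line string "A\nB\nX" used in the documentation example.

```python
import re

s = "A\nB\nX"  # three lines: A, B, X

# Without MULTILINE, ^ only matches at the very start of the string.
no_flag_match = re.match("X", s)
no_flag_search = re.search("^X", s)

# With MULTILINE, ^ also matches right after each newline,
# but match() still only tries position 0.
ml_match = re.match("X", s, re.MULTILINE)
ml_search = re.search("^X", s, re.MULTILINE)

print(no_flag_match)     # None
print(no_flag_search)    # None
print(ml_match)          # None
print(ml_search.span())  # (4, 5)
```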
With regards to \A and \Z, neither performs differently for re.MULTILINE since \A and \Z are effectively the only ^ and $ in the whole string.
So using \A and \Z with any of the three methods yields the same results.
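A quick way to see this is to run one \A...\Z pattern through all three functions with re.MULTILINE set. The strings and patterns below are illustrative choices, not taken from the documentation.

```python
import re

s = "A\nB\nX"
funcs = (re.match, re.search, re.fullmatch)

# "X" is a complete line but not the complete string, so \AX\Z fails
# for every function, even though MULTILINE is enabled.
line_only = [f(r"\AX\Z", s, re.MULTILINE) for f in funcs]

# A pattern that covers the whole string succeeds for every function.
whole = [f(r"\AA\nB\nX\Z", s, re.MULTILINE) for f in funcs]

print(line_only)                  # [None, None, None]
print([m.span() for m in whole])  # [(0, 5), (0, 5), (0, 5)]
```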
What this tells me is that re.match and re.fullmatch don't match line anchors ^ and $ respectively, but that they instead match string anchors \A and \Z respectively.
match and fullmatch appear to be equivalent to re.search calls that start with \A, or that start with \A and end with \Z.
Because \A always specifies the beginning of the string, using re.search and prepending \A seems to be equivalent to re.match, even under MULTILINE mode. Some examples:
import re

haystack = "A\nB\nZ"

matchstring = 'A'
x = re.match(matchstring, haystack)  # Match
y = re.search(r'\A' + matchstring, haystack)  # Match

matchstring = 'A$\nB'
x = re.match(matchstring, haystack, re.MULTILINE)  # Match
y = re.search(r'\A' + matchstring, haystack, re.MULTILINE)  # Match

matchstring = 'A\n$B'
x = re.match(matchstring, haystack, re.MULTILINE)  # No match
y = re.search(r'\A' + matchstring, haystack, re.MULTILINE)  # No match
The same is true for putting the search string between \A and \Z to be equivalent to fullmatch.
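A sketch of the fullmatch case, under one extra assumption that the text above does not spell out: the pattern is wrapped in a non-capturing group so that a top-level alternation stays inside the anchors. search_as_fullmatch is a hypothetical helper name, not part of the re module.

```python
import re

def search_as_fullmatch(pattern, string, flags=0):
    # Hypothetical helper: emulate fullmatch() with search() by anchoring
    # the pattern between \A and \Z. The (?:...) group keeps alternations
    # such as a|aa from splitting the anchors between the branches.
    return re.search(r"\A(?:" + pattern + r")\Z", string, flags)

haystack = "A\nB\nZ"
results = []
for pattern in ("A\nB\nZ", "A$\nB$\nZ", "A"):
    full = re.fullmatch(pattern, haystack, re.MULTILINE)
    emulated = search_as_fullmatch(pattern, haystack, re.MULTILINE)
    results.append((full is not None, emulated is not None))

print(results)  # [(True, True), (True, True), (False, False)]

# Why the group matters: without it, \Aa|aa\Z parses as (\Aa)|(aa\Z).
ungrouped = re.search(r"\Aa|aa\Z", "aa").group()
grouped = search_as_fullmatch(r"a|aa", "aa").group()
print(ungrouped, grouped)  # a aa
```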
Without \A / \Z: No, they treat MULTILINE differently. From the documentation:
Note however that in MULTILINE mode match() only matches at the beginning of the string, whereas using search() with a regular expression beginning with '^' will match at the beginning of each line.
...
>>> re.match('X', 'A\nB\nX', re.MULTILINE)  # No match
>>> re.search('^X', 'A\nB\nX', re.MULTILINE)  # Match
<re.Match object; span=(4, 5), match='X'>
Likewise, in MULTILINE mode, fullmatch() matches at the beginning and end of the string, and search() with '^...$' matches at the beginning and end of each line.
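The fullmatch side of this can be sketched the same way as the documentation example. The second half of the sketch is an addition that the answer does not mention: even without MULTILINE, $ also matches just before a trailing newline, so '^...$' is slightly looser than fullmatch().

```python
import re

s = "A\nB\nX"

# fullmatch() must consume the entire string, regardless of MULTILINE.
full = re.fullmatch("X", s, re.MULTILINE)

# search() with ^...$ only needs one complete line in MULTILINE mode.
per_line = re.search("^X$", s, re.MULTILINE)

print(full)             # None
print(per_line.span())  # (4, 5)

# Addition: without any flags, $ still matches before a final newline,
# while fullmatch() and \Z do not.
trailing = "X\n"
dollar = re.search(r"^X$", trailing)
full_trailing = re.fullmatch(r"X", trailing)
strict = re.search(r"\AX\Z", trailing)

print(dollar.span())   # (0, 1)
print(full_trailing)   # None
print(strict)          # None
```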