Python regex - difference between search and findall

Tags:

python

regex

I am trying to use python regex on a URL string.

>>> import re
>>> id = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'
>>> re.search('news|ejournals|theses',id).group()
'ejournals'
>>> re.findall('news|ejournals|theses',id)
['ejournals', 'news']

According to the docs at http://docs.python.org/2/library/re.html#finding-all-adverbs, search() returns the first match and findall() returns all non-overlapping matches in the string.

I am wondering why 'news' is not captured with search() even though it is listed first in the pattern.

Did I use the wrong pattern? I want to check whether any of those keywords occur in the string.

kich asked Feb 26 '13 21:02




2 Answers

You're thinking about it backwards. The regex goes through the target string looking for "news" OR "ejournals" OR "theses" and returns the first one it finds. In this case "ejournals" appears first in the target string.
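To make the "first one it finds" point concrete, here is a small sketch using the string from the question. The variable names (`target`, `pattern`, `positions`) are only illustrative; `finditer()` is used so the start offset of every match is visible.

```python
import re

# The target string from the question.
target = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'
pattern = re.compile('news|ejournals|theses')

# search() scans left to right and returns the leftmost match in the string.
first = pattern.search(target)

# finditer() yields every non-overlapping match, in string order.
positions = [(m.group(), m.start()) for m in pattern.finditer(target)]

print(first.group(), first.start())
print(positions)
```

The start offset of 'ejournals' is smaller than that of 'news' (which only appears inside 'newsome'), so search() reports 'ejournals' regardless of the order of the alternatives.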

Joel Cornett answered Oct 01 '22 21:10



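The re.search() function stops at the first position in the string where any of the alternatives matches, not at the first alternative listed in the pattern. The order of the alternatives only matters when more than one of them could match at that same position.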
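For the stated goal of checking whether any keyword occurs, the pattern in the question already works. If the order of the keywords is meant as a priority, one possible approach (an assumption about the intent, not something the question states) is to test the keywords one by one. The names `keywords`, `has_keyword` and `first_by_priority` below are hypothetical.

```python
import re

target = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'
keywords = ['news', 'ejournals', 'theses']

# True if at least one keyword occurs anywhere in the string.
# re.escape() guards against keywords that contain regex metacharacters.
alternation = '|'.join(re.escape(k) for k in keywords)
has_keyword = re.search(alternation, target) is not None

# First keyword in *list* order that occurs in the string, or None.
first_by_priority = next((k for k in keywords if k in target), None)

print(has_keyword)
print(first_by_priority)
```

Here the loop picks 'news' because it is tested first, whereas re.search() picks 'ejournals' because it occurs earlier in the string.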

Nisan.H answered Oct 01 '22 20:10
