I am trying to use python regex on a URL string.
id= 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'
>>> re.search('news|ejournals|theses',id).group()
'ejournals'
>>> re.findall('news|ejournals|theses',id)
['ejournals', 'news']
Based on the docs at http://docs.python.org/2/library/re.html#finding-all-adverbs, it says search() matches the first one and find all matches all the possible ones in the string.
I am wondering why 'news' is not captured with search even though it is declared first in the pattern.
Did i use the wrong pattern ? I want to search if any of those keywords occur in the string.
findall() helps to get a list of all matching patterns. It searches from start or end of the given string. If we use method findall to search for a pattern in a given string it will return all occurrences of the pattern. While searching a pattern, it is recommended to use re.
Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default).
There is a difference between the use of both functions. Both return the first match of a substring found in the string, but re. match() searches only from the beginning of the string and return match object if found.
findall() is probably the single most powerful function in the re module. Above we used re.search() to find the first match for a pattern. findall() finds *all* the matches and returns them as a list of strings, with each string representing one match.
You're thinking about it backwards. The regex goes through the target string looking for "news"
OR "ejournals"
OR "theses"
and returns the first one it finds. In this case "ejournals"
appears first in the target string.
The re.search()
function stops after the first occurrence that satisfies your condition, not the first option in the pattern.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With