I'm using the finditer function in the re module to match some things and everything is working.
Now I need to find out how many matches I've got. Is it possible without looping through the iterator twice (once to find out the count and then again for the real iteration)?
Some code:
imageMatches = re.finditer("<img src\=\"(?P<path>[-/\w\.]+)\"", response[2])
# <Here I need to get the number of matches>
for imageMatch in imageMatches:
    doStuff
Everything works, I just need to get the number of matches before the loop.
In JavaScript, to count the number of regex matches, call the match() method on the string, passing it the regular expression as a parameter, e.g. (str.match(/[a-z]/g) || []).length. With the g flag, match() returns an array of the regex matches, or null if no matches are found.
In Python, to count how many times a regex pattern occurs in a given string, use len(re.findall(pattern, string)), which returns the number of non-overlapping matches, or len([*re.finditer(pattern, text)]), which unpacks all the match objects into a list and returns its length.
Occurrence Indicators (or Repetition Operators): + means one or more (1+), e.g., [0-9]+ matches one or more digits such as '123' or '000'. * means zero or more (0+), e.g., [0-9]* matches zero or more digits; it accepts everything [0-9]+ does, plus the empty string.
Definition and Usage. The [0-9] expression matches any single character in the range given between the brackets. The range can be any digits or span of digits from 0 to 9. Tip: Use the [^0-9] expression to match any character that is NOT a digit.
If you know you will want all the matches, you could use the re.findall function. It will return a list of all the matches. Then you can just do len(result) for the number of matches.
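As a minimal sketch of this approach, assuming a small made-up input (the `html` string and the `IMG_PATTERN` name are hypothetical, standing in for response[2] and the pattern from the question): when the pattern contains a single capturing group, re.findall returns that group's text for each match rather than the whole match, so the count is unaffected but you get strings instead of match objects.

```python
import re

# Hypothetical sample input standing in for response[2] from the question.
html = '<img src="/a/one.png"> text <img src="/b/two.jpg">'

# Same idea as the question's pattern, written as a raw string.
IMG_PATTERN = r'<img src="(?P<path>[-/\w.]+)"'

# With a single capturing group, findall returns that group's text per match.
paths = re.findall(IMG_PATTERN, html)
match_count = len(paths)

print(match_count)
for path in paths:
    print(path)
```

This only fits when the matched text is all you need; positions and named-group access require match objects.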
If you always need to know the length, and you just need the content of the match rather than the other info, you might as well use re.findall. Otherwise, if you only need the length sometimes, you can use e.g.
matches = re.finditer(...)
...
matches = tuple(matches)
to store the matches in a reusable tuple. Then just do len(matches).
Another option, if you just need to know the total count after doing whatever with the match objects, is to use
matches = enumerate(re.finditer(...))
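which will yield an (index, match) pair for each of the original matches. The index starts at 0, so if you keep the index from the last pair in some variable, the total count is that index plus one (or pass start=1 to enumerate so the last index is the count itself).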
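A rough sketch of the enumerate idea, assuming a sample string and pattern chosen only for illustration. It passes start=1 so the loop variable ends up holding the count, and initializes it beforehand because the loop body never runs when nothing matches.

```python
import re

text = "i like python jython and dython"  # sample string for illustration

total = 0  # stays 0 if the pattern never matches
for total, match in enumerate(re.finditer(r".ython", text), start=1):
    # Do the real per-match work here; total is the running count so far.
    print(total, match.group())

# After the loop, total holds the number of matches that were processed.
print(total)
```

The count is only known after the loop finishes, so this does not help when the number is needed up front.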
But if you need the length first of all, and you need match objects as opposed to just the strings, you should just do
matches = tuple(re.finditer(...))
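For instance, a minimal sketch under the assumption of a made-up input string (`html` is hypothetical and stands in for response[2] from the question; the pattern is the question's, rewritten as a raw string):

```python
import re

# Hypothetical sample input standing in for response[2] from the question.
html = '<img src="/a/one.png"> text <img src="/b/two.jpg">'

# Materialize the iterator once so it can be measured and then reused.
image_matches = tuple(re.finditer(r'<img src="(?P<path>[-/\w.]+)"', html))
match_count = len(image_matches)
print(match_count)

# The tuple still holds full match objects, so named groups and offsets work.
found = [(m.group("path"), m.start()) for m in image_matches]
print(found)
```

The trade-off is memory: every match object is kept alive at once, which is usually fine for a single page of HTML but may matter for very large inputs.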
# An example of counting matched groups (capture groups within one match, not the number of matches)
import re
pattern = re.compile(r'(\w+).(\d+).(\w+).(\w+)', re.IGNORECASE)
search_str = "My 11 Char String"
res = pattern.match(search_str)
print(len(res.groups()))  # 4
print(res.group(1))  # My
print(res.group(2))  # 11
print(res.group(3))  # Char
print(res.group(4))  # String
If you find you need to stick with finditer(), you can simply use a counter while you iterate through the iterator.
Example:
>>> import re
>>> pattern = re.compile(r'.ython')
>>> string = 'i like python jython and dython (whatever that is)'
>>> iterator = re.finditer(pattern, string)
>>> count = 0
>>> for match in iterator:
...     count += 1
...
>>> count
3
If you need the features of finditer() (match objects for each non-overlapping match), use this method.
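A more compact variant of the same counter idea, offered as a sketch rather than as part of the answer above: `count_matches` is a hypothetical helper name, and it consumes the iterator, so it suits cases where only the count is needed and the match objects are not.

```python
import re

def count_matches(pattern, text):
    """Count non-overlapping matches without building a list of them."""
    return sum(1 for _ in re.finditer(pattern, text))

count = count_matches(r".ython", "i like python jython and dython (whatever that is)")
print(count)
```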
I know this is a little old, but here is a concise function for counting regex patterns.
import re
def regex_cnt(string, pattern):
    return len(re.findall(pattern, string))
string = 'abc123'
print(regex_cnt(string, '[0-9]'))  # 3