In Python, compiled regex patterns have a findall method that does the following:
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
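To make the quoted behaviour concrete, here is a minimal sketch of the three result shapes findall can produce. The sample sentence is borrowed from an answer later in this thread; the variable names are only illustrative.

```python
import re

text = "I had a sandwich for lunch"

# No groups: a list of the matched substrings.
no_groups = re.compile(r"[aeiou][nrs]").findall(text)

# One group: a list holding that group's text for each match.
one_group = re.compile(r"([aeiou])[nrs]").findall(text)

# Two groups: a list of tuples, one tuple per match.
two_groups = re.compile(r"([aeiou])([nrs])").findall(text)

print(no_groups)   # ['an', 'or', 'un']
print(one_group)   # ['a', 'o', 'u']
print(two_groups)  # [('a', 'n'), ('o', 'r'), ('u', 'n')]
```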
What's the canonical way of doing this in Perl? A naive algorithm I can think of is along the lines of "while a search and replace with the empty string is successful, do [suite]". I'm hoping there's a nicer way. :-)
Thanks in advance!
finditer and findall differ in what they return, not in what they find. Both locate every non-overlapping match in the given string. findall returns the matches as a list, while finditer returns an iterator that yields one match object per match.
The finditer() method finds the same matches as re.findall(), except that it returns an iterator yielding match objects for the regex pattern in a string instead of a list of strings. It scans the string from left to right, and matches are yielded in the order found.
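A short sketch of the difference, assuming the same sample sentence used elsewhere in this thread: finditer yields a match object for every match, so the group contents and the match positions are both available.

```python
import re

pattern = re.compile(r"([aeiou])([nrs])")
text = "I had a sandwich for lunch"

# findall: group contents only, collected into a list.
found = pattern.findall(text)

# finditer: one match object per match, consumed lazily.
matches = list(pattern.finditer(text))

whole = [m.group(0) for m in matches]   # full matched text
groups = [m.groups() for m in matches]  # same tuples findall returns
spans = [m.span() for m in matches]     # (start, end) offsets

print(found)
print(whole)
print(spans)
```

Both calls visit the same three matches; the iterator simply carries more information per match.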
The re module's findall() function is useful because it returns a list of strings containing all matches. If the pattern is not found, re.findall() returns an empty list.
re.search() finds only the first match for a pattern. findall() finds *all* the matches and returns them as a list of strings, with each string representing one match.
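As a quick illustration of the contrast (the input string and patterns are just examples):

```python
import re

text = "foobarbaz"

# search: a match object for the first hit, or None if there is none.
first = re.search(r"[aeiou]", text)

# findall: every hit, as a list of strings.
every = re.findall(r"[aeiou]", text)

# findall with a pattern that never matches: an empty list, not None.
missing = re.findall(r"[xyz]{2}", text)

print(first.group(), first.start())  # o 1
print(every)                         # ['o', 'o', 'a', 'a']
print(missing)                       # []
```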
Use the /g modifier in your match. From the perlop manual:
The "/g" modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern. In scalar context, each execution of "m//g" finds the next match, returning true if it matches, and false if there is no further match. The position after the last match can be read or set using the pos() function; see "pos" in perlfunc. A failed match normally resets the search position to the beginning of the string, but you can avoid that by adding the "/c" modifier (e.g. "m//gc"). Modifying the target string also resets the search position.
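For readers coming from the Python side, the scalar-context behaviour can be loosely pictured as repeatedly calling a compiled pattern's search() with an explicit start offset. This is only an analogy written in Python, not how Perl implements m//g, and the pos variable here is an ordinary local that stands in for Perl's pos().

```python
import re

pattern = re.compile(r"[aeiou]")
text = "foobarbaz"

pos = 0      # plays the role of Perl's pos() for this string
found = []
while True:
    m = pattern.search(text, pos)  # "find the next match"
    if m is None:                  # no further match: stop
        break
    found.append((m.group(), m.start()))
    pos = m.end()                  # resume right after this match

print(found)  # [('o', 1), ('o', 2), ('a', 4), ('a', 7)]
```

The sketch assumes the pattern cannot match the empty string; a pattern that can would need pos advanced by one after an empty match to avoid looping forever.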
To build on Chris' response, it's probably most relevant to encase the //g regex in a while loop, like:
my @matches;
# In scalar context, each m//g evaluation finds the next match.
while ( 'foobarbaz' =~ m/([aeiou])/g )
{
    push @matches, $1;
}
Pasting some quick Python I/O:
>>> import re
>>> re.findall(r'([aeiou])([nrs])','I had a sandwich for lunch')
[('a', 'n'), ('o', 'r'), ('u', 'n')]
To get something comparable in Perl, the construct could be something like:
my $matches = [];
while ( 'I had a sandwich for lunch' =~ m/([aeiou])([nrs])/g )
{
    # Store each match's two captures as an array reference.
    push @$matches, [$1,$2];
}
But in general, whatever you planned to do with each match afterwards, you can probably do inside the while loop itself.
Nice beginner reference with similar content to @kyle's answer: Perl Tutorial: Using regular expressions