I have a regex:
/abc(def)ghi(jkl)mno(pqr)/igs
How would I capture the result of each set of parentheses into 3 different variables, one for each set? Right now I am using one array to capture all the results. They come out sequentially, but then I have to parse them, and the list could be huge.
@results = ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs);
In Perl, the text matched by each capture group is stored in a series of special variables: the first group in $1, the second in $2, and so on. Groups are numbered by the position of their opening parenthesis, counting from the left, so the first left parenthesis opens the group that fills $1, the second opens the group that fills $2, and so on.
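As a minimal sketch of how this answers the question, a match in list context returns the capture groups, so they can be assigned straight into three scalars. The sample string and the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input; the pattern is the one from the question.
my $string = 'abcdefghijklmnopqr';

# In list context a successful match returns its capture groups,
# so each group lands in its own scalar.
my ($first, $second, $third) = $string =~ /abc(def)ghi(jkl)mno(pqr)/is;

# The same captures are also available as $1, $2 and $3 after a successful match.
my $numbered = '';
if ($string =~ /abc(def)ghi(jkl)mno(pqr)/is) {
    $numbered = "$1 $2 $3";
}

print "$first $second $third\n";   # prints "def jkl pqr"
print "$numbered\n";               # prints "def jkl pqr"
```

Always check that the match succeeded before reading $1, $2, $3, since a failed match leaves them set to whatever the previous successful match stored.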
Perl's special character classes include \d (digit), which matches any single digit character. For ASCII-only matching it is equivalent to [0-9]; on Unicode strings it also matches digits from other scripts unless the /a modifier is used. For example, the regex /\d/ matches a single digit.
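The difference between \d and [0-9] only shows up on non-ASCII input. The sketch below assumes Perl 5.14 or later (for the /a modifier) and uses an Arabic-Indic digit as a hypothetical test character.

```perl
use strict;
use warnings;

my $ascii  = '7';
my $arabic = "\x{0663}";   # ARABIC-INDIC DIGIT THREE (assumed test input)

my @results = (
    $ascii  =~ /\A\d\z/     ? 1 : 0,   # ASCII digit matches \d
    $arabic =~ /\A\d\z/     ? 1 : 0,   # a Unicode digit also matches plain \d
    $arabic =~ /\A\d\z/a    ? 1 : 0,   # /a restricts \d to [0-9] (Perl 5.14+)
    $arabic =~ /\A[0-9]\z/  ? 1 : 0,   # the explicit range is ASCII-only
);

print "@results\n";   # prints "1 1 0 0"
```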
The '=~' operator is a binary binding operator: it indicates that the following operation will search or modify the scalar on its left. When no operator is given, 'm' (match) is assumed. The match operator uses a pair of delimiter characters (slashes by default) to mark where the regular expression begins and ends.
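A short sketch of the binding operator with both a match and a substitution; the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = 'abcdefghi';   # hypothetical sample string

# No explicit operator: m// is implied, so these two tests are equivalent.
# With an explicit m, other delimiters such as braces may be used.
my $implicit = $text =~ /def/  ? 1 : 0;
my $explicit = $text =~ m{def} ? 1 : 0;

# With s///, the binding operator makes the substitution modify the
# scalar on its left, here a copy of $text.
(my $copy = $text) =~ s/def/DEF/;

print "$implicit $explicit $copy\n";   # prints "1 1 abcDEFghi"
```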
In Python, capturing every match of a regex group is done with re.finditer(), which finds all matches and returns an iterator yielding match objects; you then iterate over the match objects and extract each value. Perl has no finditer(); the equivalent is a while loop around a /g match in scalar context.
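In Perl, iterating over every match is usually written as a while loop around a /g match in scalar context, which resumes after the previous match on each pass. A minimal sketch, assuming a hypothetical input that contains the question's pattern twice:

```perl
use strict;
use warnings;

# Hypothetical input containing the pattern twice, in different case.
my $string = "abcdefghijklmnopqr ABCDEFGHIJKLMNOPQR";

my (@first, @second, @third);

# Each iteration finds the next match; $1, $2, $3 hold that match's groups.
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    push @first,  $1;
    push @second, $2;
    push @third,  $3;
}

print "@first | @second | @third\n";   # prints "def DEF | jkl JKL | pqr PQR"
```

Because only one match is processed at a time, this avoids building one large flat list of all captures.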
Your question is a bit ambiguous to me, but I think you want to do something like this:
my (@first, @second, @third);
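# Use a scalar-context /g match so each pass resumes after the previous match.
# (Assigning the match to a list in the condition restarts from the beginning
# every time and loops forever once the pattern matches.)
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    push @first,  $1;
    push @second, $2;
    push @third,  $3;
}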
Starting with 5.10, you can use named capture buffers as well:
#!/usr/bin/perl
use strict; use warnings;
my %data;
my $s = 'abcdefghijklmnopqr';
if ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/x ) {
    push @{ $data{$_} }, $+{$_} for keys %+;
}
use Data::Dumper;
print Dumper \%data;
Output:
$VAR1 = {
          'first' => [
                       'def'
                     ],
          'second' => [
                        'jkl'
                      ],
          'third' => [
                       'pqr'
                     ]
        };
For earlier versions, you can use the following which avoids having to add a line for each captured buffer:
#!/usr/bin/perl
use strict; use warnings;
my $s = 'abcdefghijklmnopqr';
my @arrays = \ my(@first, @second, @third);
if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $arrays[$_] }, $captured[$_] for 0 .. $#arrays;
}
use Data::Dumper;
print Dumper @arrays;
Output:
$VAR1 = [
          'def'
        ];
$VAR2 = [
          'jkl'
        ];
$VAR3 = [
          'pqr'
        ];
But I like keeping related data in a single data structure, so it is best to go back to using a hash. This does require an auxiliary array, however:
my %data;
my @keys = qw( first second third );
if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $data{$keys[$_]} }, $captured[$_] for 0 .. $#keys;
}
Or, if the names of the variables really are first, second, etc., or if the names of the buffers don't matter but only order does, you can use:
my @data;
if ( my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $data[$_] }, $captured[$_] for 0 .. $#captured;
}