I have text like this:
00:00 stuff 00:01 more stuff multi line and going 00:02 still have
So, I don't have a block end, just a new block start.
I want to recursively get all blocks:
1 = 00:00 stuff 2 = 00:01 more stuff multi line and going
etc
The bellow code only gives me this:
$VAR1 = '00:00';
$VAR2 = '';
$VAR3 = '00:01';
$VAR4 = '';
$VAR5 = '00:02';
$VAR6 = '';
What am I doing wrong?
my $text = '00:00 stuff
00:01 more stuff
multi line
and going
00:02 still
have
';
my @array = $text =~ m/^([0-9]{2}:[0-9]{2})(.*?)/gms;
print Dumper(@array);
Version 5.10.0 introduced named capture groups that are useful for matching nontrivial patterns.
(?'NAME'pattern)
(?<NAME>pattern)
A named capture group. Identical in every respect to normal capturing parentheses
()
but for the additional fact that the group can be referred to by name in various regular expression constructs (such as\g{NAME}
) and can be accessed by name after a successful match via%+
or%-
. See perlvar for more details on the%+
and%-
hashes.If multiple distinct capture groups have the same name then the
$+{NAME}
will refer to the leftmost defined group in the match.The forms
(?'NAME'pattern)
and(?<NAME>pattern)
are equivalent.
Named capture groups allow us to name subpatterns within the regex as in the following.
use 5.10.0; # named capture buffers
my $block_pattern = qr/
(?<time>(?&_time)) (?&_sp) (?<desc>(?&_desc))
(?(DEFINE)
# timestamp at logical beginning-of-line
(?<_time> (?m:^) [0-9][0-9]:[0-9][0-9])
# runs of spaces or tabs
(?<_sp> [ \t]+)
# description is everything through the end of the record
(?<_desc>
# s switch makes . match newline too
(?s: .+?)
# terminate before optional whitespace (which we remove) followed
# by either end-of-string or the start of another block
(?= (?&_sp)? (?: $ | (?&_time)))
)
)
/x;
Use it as in
my $text = '00:00 stuff
00:01 more stuff
multi line
and going
00:02 still
have
';
while ($text =~ /$block_pattern/g) {
print "time=[$+{time}]\n",
"desc=[[[\n",
$+{desc},
"]]]\n\n";
}
Output:
$ ./blocks-demo time=[00:00] desc=[[[ stuff ]]] time=[00:01] desc=[[[ more stuff multi line and going ]]] time=[00:02] desc=[[[ still have ]]]
This should do the trick. Beginning of next \d\d:\d\d is treated as block end.
use strict;
my $Str = '00:00 stuff
00:01 more stuff
multi line
and going
00:02 still
have
00:03 still
have' ;
my @Blocks = ($Str =~ m#(\d\d:\d\d.+?(?:(?=\d\d:\d\d)|$))#gs);
print join "--\n", @Blocks;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With