I want to extract the row key (here 28_2820201112122420516_000000), the column name (here bcp_startSoc), and the value (here 64.0) from $str, where $str is a row from HBase:
# `match` is OK
my $str = '28_2820201112122420516_000000 column=d:bcp_startSoc, timestamp=1605155065124, value=64.0';
my $match = $str.match(/^ ([\d+]+ % '_') \s 'column=d:' (\w+) ',' \s timestamp '=' \d+ ',' \s 'value=' (<-[=]>+) $/);
my @match-result = $match».Str.Slip;
say @match-result; # Output: [28_2820201112122420516_000000 bcp_startSoc 64.0]
# `smartmatch` is OK
# $str ~~ /^ ([\d+]+ % '_') \s 'column=d:' (\w+) ',' \s timestamp '=' \d+ ',' \s 'value=' (<-[=]>+) $/;
# say $/».Str.Array; # Output: [28_2820201112122420516_000000 bcp_startSoc 64.0]
# `comb` is NOT OK
# A <( token indicates the start of the match's overall capture, while the corresponding )> token indicates its endpoint.
# The <( is similar to \K in other languages: it discards anything matched before it.
my @comb-result = $str.comb(/<( [\d+]+ % '_' )> \s 'column=d:' <(\w+)> ',' \s timestamp '=' \d+ ',' \s 'value=' <(<-[=]>+)>/);
say @comb-result; # Expect: [28_2820201112122420516_000000 bcp_startSoc 64.0], but got [64.0]
I want comb to skip some matches and match only what I want, so I use multiple <( and )> here, but I only get the last match as the result.
Is it possible to use comb to get the same result as the match method?
TL;DR Multiple <(...)>s don't mean multiple captures. Even if they did, .comb reduces each match to a single string in the list of strings it returns. If you really want to use .comb, one way is to go back to your original regex but also store the desired data using additional code inside the regex.
<(...)>s don't mean multiple captures

The default start point for the overall match of a regex is the start of the regex. The default end point is the end.
Writing <( resets the start point for the overall match to the position you insert it at. Each time you insert one and it gets applied during processing of a regex, it resets the start point. Likewise, )> resets the end point. At the end of processing a regex, the final settings for the start and end are applied in constructing the final overall match.
Given that your code just unconditionally resets each point three times, the last start and end resets "win".
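A minimal sketch of that reset behaviour, using a made-up input string ('abc=123' is an assumption chosen for illustration, not taken from the question). Both pairs of markers are applied, but only the last start and end positions survive in the overall match:

```raku
# Hypothetical input, chosen only to illustrate the reset behaviour.
my $m = 'abc=123' ~~ / <( \w+ )> '=' <( \d+ )> /;

# The second <( and )> overwrite the positions set by the first pair.
say $m.Str;   # 123
say $m.from;  # 4
say $m.to;    # 7
```

The text matched by \w+ still has to be present for the regex to succeed; it simply ends up outside the reported match.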
.comb reduces each match to a single string

foo.comb(/.../) is equivalent to foo.match(:g, /.../)>>.Str.
That means you only get one string for each match against the regex.
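To see the equivalence in action, here is a small sketch with an assumed input ('a1 b2 c3' is illustrative only). The regex contains a positional capture, yet both calls return just the text of each overall match:

```raku
# Hypothetical input; the (\d) capture is deliberately ignored by both calls.
my $text = 'a1 b2 c3';

my @via-comb  = $text.comb(/ \w (\d) /);
my @via-match = $text.match(:g, / \w (\d) /)».Str;

say @via-comb;   # [a1 b2 c3]
say @via-match;  # [a1 b2 c3]
```

The captured digits are still available on the Match objects returned by .match, but they are dropped once each match is stringified.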
One possible solution is to use the approach @ohmycloudy shows in their answer.
But that comes with the caveats raised by myself and @jubilatious1 in comments on their answer.
Add { @comb-result .push: |$/».Str } to the regex

You can work around .comb's normal functioning. I'm not saying it's a good thing to do. Nor am I saying it's not. You asked, I'm answering, and that's it. :)
Start with your original regex that worked with your other solutions.
Then add { @comb-result .push: |$/».Str }
to the end of the regex to store the result of each match. Now you will get the result you want.
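Put together, the two steps above might look like the sketch below. It reuses $str and the regex from the question; the @whole-matches array is an addition of this sketch (not part of the original answer) whose only job is to make .comb run to completion and to show what .comb itself returns:

```raku
my $str = '28_2820201112122420516_000000 column=d:bcp_startSoc, timestamp=1605155065124, value=64.0';
my @comb-result;

# Assigning to an array forces .comb to process the whole string.
# The code block at the end runs once per successful match and
# stores the positional captures as strings.
my @whole-matches = $str.comb(/
    ^ ([\d+]+ % '_') \s
    'column=d:' (\w+) ',' \s
    timestamp '=' \d+ ',' \s
    'value=' (<-[=]>+) $
    { @comb-result .push: |$/».Str }
/);

say @comb-result;          # [28_2820201112122420516_000000 bcp_startSoc 64.0]
say @whole-matches.elems;  # 1
```

Note that the value .comb returns is still a single string per match (here, the entire row); the wanted pieces arrive only through the side effect on @comb-result.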
$str.comb( / ^ [\d+]+ % '_' | <?after d\:> \w+ | <?after value\=> .*/ )