
perl6 interpolate array in match for AND, OR, NOT functions

I am trying to redo my program for match-all, match-any, and match-none of the items in an array. Some of the Perl 6 documentation doesn't explain the behavior of the current implementation (Rakudo 2018.04), and I have a few more questions.

(1) Documentation on regex says that interpolating array into match regex means "longest match"; however, this code does not seem to do so:

> my $a="123 ab 4567 cde";
123 ab 4567 cde
> my @b=<23 b cd 567>;
[23 b cd 567]
> say (||@b).WHAT
(Slip)
> say $a ~~ m/ @b /
 「23」    # <=== I expected the match to be "567" (@b[3] matching $a) which is longer than "23";

(2) (||@b) is a Slip; how do I easily do OR or AND of all the elements in the array without explicitly looping through the array?

> say $a ~~ m:g/ @b /
(「23」 「b」 「567」 「cd」)
> say $a ~~ m:g/ ||@b /
(「23」 「b」 「567」 「cd」)
> say $a ~~ m/ ||@b /
 「23」
> say $a ~~ m:g/ |@b /
(「23」 「b」 「567」 「cd」)
> say $a ~~ m:g/ &@b /
(「23」 「b」 「567」 「cd」)
> say $a ~~ m/ &@b /
 「23」
> say $a ~~ m/ &&@b /
 「23」    # <=== && and & don't do the AND function

(3) What I ended up doing is condensing my previous code into 2 lines:

my $choose = &any; # can prompt for choice of any, one, all, none here;
say so (gather { for @b -> $z { take $a ~~ m/ { say "==>$_ -->$z"; } <{$z}> /; } }).$choose;

The output is "True", as expected. But I am hoping for a simpler way, without the "gather-take" and the "for" loop.

Thank you very much for any insights.

lisprog

asked Jun 13 '18 04:06 by lisprogtor


2 Answers

interpolate array in match for AND, OR, NOT functions

I don't know any better solution than Moritz's for AND.

I cover OR below.

One natural way to write a NOT of a list of match tokens would be to use the negated version of a lookahead or lookbehind assertion, e.g.:

my $a="123 ab 4567 cde";
my @b=<23 b cd 567>;
say $_>>.pos given $a ~~ m:g/ <!before @b> /;

displays:

(0 2 3 4 6 7 9 10 11 13 14 15)

These are the 12 positions in the string "123 ab 4567 cde" at which none of 23, b, cd, or 567 matches. The line of ^s below points to each position that matched:

my $a="123 ab 4567 cde";
       ^ ^^^ ^^ ^^^ ^^^
       0123456789012345

I am trying to re-do my program for match-all, match-any, match-none of the items in an array.

These sound junction-like, and some of the rest of your question is clearly all about junctions. If you linked to your existing program, it might be easier for me and others to see what you're trying to do.

(1)

||@b matches the leftmost matching token in @b, not the longest one.

Write |@b, with a single |, to match the longest matching token in @b. Or, better yet, write just plain @b, which is shorthand for the same thing.

Both of these match patterns (|@b or ||@b), like any other match patterns, are subject to the way the regex engine works, as briefly described by Moritz and in more detail below.

When the regex engine matches a regex against an input string, it starts at the start of the regex and the start of the input string.

If it fails to match, it steps past the first character in the input string, giving up on that character, and instead pretends the input string began at its second character. Then it tries matching again, starting at the start of the regex but the second character of the input string. It repeats this until it either gets to the end of the string or finds a match.

Given your example, the engine fails to match right at the start of 123 ab 4567 cde but successfully matches 23 starting at the second character position. So it's then done -- and the 567 in your match pattern is irrelevant.
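
The leftmost-first behavior can be checked directly. This is a minimal sketch reusing the question's $a and @b; it relies only on the built-in Str.index and Match.from, and the variable $m is introduced here purely for illustration:

```raku
my $a = "123 ab 4567 cde";
my @b = <23 b cd 567>;

# Where does each token first occur in the input string?
for @b -> $token {
    say "$token starts at { $a.index($token) }";
}

# A plain (non-:g) match stops at the earliest position where any token matches.
my $m = $a ~~ m/ @b /;
say $m.Str;    # 23
say $m.from;   # 1
```

Since 23 starts at position 1 and 567 only at position 8, the engine never gets as far as trying 567.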

One way to get the answer you expected:

my $a="123 ab 4567 cde";
my @b=<23 b cd 567>;

my $longest-overall = '';
sub update-longest-overall ($latest) {
  if $latest.chars > $longest-overall.chars {
    $longest-overall = $latest
  }
}

$a ~~ m:g/ @b { update-longest-overall( $/ ) } /;

say $longest-overall;

displays:

「567」

The use of :g is explained below.

(2)

|@b or ||@b in mainline code mean something completely unrelated to what they mean inside a regex. As you can see, |@b is the same as @b.Slip. ||@b means @b.Slip.Slip which evaluates to @b.Slip.

To do a "parallel" longest-match-pattern-wins OR of the elements of @b, write @b (or |@b) inside a regex.

To do a "sequential" leftmost-match-pattern-wins OR of the elements of @b, write ||@b inside a regex.
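
To see the two kinds of OR differ, the tokens have to be able to match at the same starting position. The following is a small sketch with a made-up array @c (not from the question) whose elements all match at the start of "abcd"; the explicit alternations are written out alongside the interpolated forms for comparison:

```raku
my @c = <a ab abc>;

# Longest-token alternation: the longest candidate at this position wins.
say "abcd" ~~ / a | ab | abc /;     # 「abc」
say "abcd" ~~ / @c /;               # 「abc」

# Sequential alternation: the first branch, in written order, that matches wins.
say "abcd" ~~ / a || ab || abc /;   # 「a」

# Per the description above, this should behave like the sequential line.
say "abcd" ~~ / ||@c /;
```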

I've so far been unable to figure out what & and && do when used to prefix an array in a regex. It looks to me like there are multiple bugs related to their use.

In some of the code in your question you've specified the :g adverb. This directs the engine to not stop when it finds a match but rather to step past the substring it just matched and begin trying to match again further along in the input string.

(There are other adverbs. The :ex adverb is the most extreme. In this case, when there's a match at a given position in the input string, the engine tries to match any other match pattern at the same position in the regex and input string. It keeps doing this no matter how many matches it accumulates until it has tried every last possible match at that position in the regex and input string. Only when it's exhausted all these possibilities does it move forward one character in the input string, and tries exhaustively matching all over again.)

(3)

My best shot:

my $a="123 ab 4567 cde";
my @b=<23 b cd 567>;
my &choose = &any;
say so choose do for @b -> $z {
  $a ~~ / { say "==>$a -->$z"; } $z /
}
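
If the items in @b are plain literal strings rather than regex patterns (an assumption; the question does not say), a loop-free alternative is to pass a junction to the built-in Str.contains method. This is only a sketch of that idea, not a replacement for the regex approach, and @c is a made-up array for the mixed case:

```raku
my $a = "123 ab 4567 cde";
my @b = <23 b cd 567>;

# The junction autothreads over .contains; `so` collapses it to a single Bool.
say so $a.contains(any(@b));    # True
say so $a.contains(all(@b));    # True
say so $a.contains(none(@b));   # False

# A mixed case: only one of the two items occurs in $a.
my @c = <23 xyz>;
say so $a.contains(all(@c));    # False
say so $a.contains(one(@c));    # True

# The junction constructor can still be chosen at run time.
my &choose = &any;
say so $a.contains(choose(@b)); # True
```

Note that .contains does literal substring search only, so this does not cover items that are meant to be interpreted as regexes.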

answered Nov 17 '22 12:11 by raiph


(1) Documentation on regex says that interpolating array into match regex means "longest match"; however, this code does not seem to do so:

The actual rule is that a regex finds the left-most match first, and the longest match second.

However, the left-most rule is true for all regex matches, which is why the regex documentation doesn't explicitly mention it when talking about alternations.

(2) (||@b) is a Slip; how do I easily do OR or AND of all the elements in the array without explicitly looping through the array?

You can always construct a regex as text first:

my $re_text = join '&&', @branches;
my $regex   = rx/ <$re_text> /;

answered Nov 17 '22 10:11 by moritz