
When is Perl 6's <|w> word boundary not a << word boundary?

Tags: regex, raku

I have these two bits of code that I thought should be equivalent. The first one uses <|w> to assert a word boundary, so that a non-word character (or the start of the string) comes before the H. The second one uses <<, the left word boundary, which I expected to do the same thing.

my $string = 'Hamadryas perlicus';
say $string ~~ /
    <?after <|w> Hamadryas \s+ >
    (\w+)
    /;

say $string ~~ /
    <?after << Hamadryas \s+ >
    (\w+)
    /;

The first one matches but the second one doesn't:

「perlicus」
 0 => 「perlicus」
Nil

Is there some other difference in these two?

asked Apr 20 '18 13:04 by brian d foy



1 Answer

This answer by timotimo in the IRC channel hints at why this happens. When you use after, the regular expression is actually flipped, so you have to swap right for left. Once you do that, it works.

use v6;

my $string = 'Hamadryas perlicus';
say $string ~~ /
    <?after  Hamadryas <|w> \s+ >
    (\w+)
    /;

say $string ~~ /
    <?after Hamadryas « \s+ >
    (\w+)
    /;

That will yield what you are looking for.
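
To see why the direction matters, here is a minimal sketch that models the flip by hand. It assumes the lookbehind behaves as if the text before the current position were reversed and the pattern pieces were tried in reverse order. That is a simplification for illustration, not a description of Rakudo's actual internals, and the variable names are made up for the example.

```raku
use v6;

# Hypothetical model: the text in front of "perlicus", flipped.
my $flipped = 'Hamadryas '.flip;    # ' sayrdamaH'

# The pieces of  << Hamadryas \s+  in reverse order: \s+ first, then the
# flipped word, then the boundary assertion, now at the end of the text.
my $left  = ($flipped ~~ / ^ \s+ sayrdamaH <<   $ /).so;
my $right = ($flipped ~~ / ^ \s+ sayrdamaH >>   $ /).so;
my $any   = ($flipped ~~ / ^ \s+ sayrdamaH <|w> $ /).so;

say '<<   ', $left;     # False
say '>>   ', $right;    # True
say '<|w> ', $any;      # True
```

Under this model, a << written at the start of the lookbehind ends up at the very end of the flipped text. There a word character is on its left and nothing is on its right, so only >> or the direction-free <|w> can succeed.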

answered Dec 03 '22 22:12 by jjmerelo