
What's the difference between $/ and $¢ in regex?

Tags:

regex

raku

As the title indicates, what is the difference between $/ and $¢? They appear to always have the same value:

my $text = "Hello world";

$text ~~ /(\w+) { say $/.raku } (\w+)/;
$text ~~ /(\w+) { say $¢.raku } (\w+)/;

Both result in Match objects with the same values. What's the logic in using one over the other?

asked Apr 27 '20 at 03:04 by user0721090601


1 Answer

The variable $/ refers to the most recent match, while the variable $¢ refers to the most recent outermost match. In most basic regexes like the above, that may be one and the same. But as can be seen from the output of the .raku method, Match objects can contain other Match objects (that's what you get when you use $<foo> or $1 for captures).

Suppose instead we had the following regex with a quantified capture:

/ ab (cd { say $¢.from, " ", $¢.to } ) + /

If we ran it against "abcdcdcd", we would see the following output:

0 2
0 4
0 6

But if we change from using $¢ to $/, we get a different result:

2 2
4 4
6 6

(The reason the .to seems to be a bit off is that it, like .pos, is not updated until the end of the capture block.)
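
For anyone who wants to reproduce the comparison, here is a minimal runnable sketch that puts both variants side by side. The variable names ($text, $via-cent, $via-slash) and the printed labels are illustrative only and not part of the original snippets. No expected output is listed, since the offsets quoted above are the ones reported here and may depend on your Rakudo version.

```raku
my $text = "abcdcdcd";

# Variant 1: report offsets through $¢ on every repetition of the group.
my $via-cent = $text ~~ / ab (cd { say '$¢: ', $¢.from, " ", $¢.to } ) + /;

# Variant 2: the same regex, reporting through $/ instead.
my $via-slash = $text ~~ / ab (cd { say '$/: ', $/.from, " ", $/.to } ) + /;

# The code blocks only print; they do not change what the regex matches.
say $via-cent.Str;
say $via-slash.Str;
```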

In other words, $¢ will always refer to what will be your final match object (i.e., $final = $text ~~ $regex), so you can traverse a complex capture tree inside of the regex exactly as you would after having finished the full match. So in the above example, you could just do $¢[0] to refer to the first match, $¢[1] the second, etc.

Inside of a regex code block, $/ will refer to the most immediate match. In the above case, that's the match for the inside of the ( ), and it won't know about the other matches, nor the original start of the matching: just the start for the ( ) block. So given a more complex regex:

/ a $<foo>=(b $<bar>=(c)+ )+ d /

We can access at any point using $¢ all of the foo tokens by saying $¢<foo>. We can access the bar tokens of a given foo by using $¢<foo>[0]<bar>. If we insert a code block inside of foo's capture, it will be able to access bar tokens by using $<bar> or $/<bar>, but it won't be able to access other foos.
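
As a point of reference, the capture tree of that regex can also be walked once the match has finished, using the same subscripts described for $¢ inside the regex. This is a minimal sketch; the sample input "abccbcd" and the variable name $m are chosen here for illustration and are not from the original.

```raku
my $m = "abccbcd" ~~ / a $<foo>=(b $<bar>=(c)+ )+ d /;

# Quantified captures come back as lists of Match objects.
say $m<foo>.elems;           # number of foo repetitions
say $m<foo>[0]<bar>.elems;   # bar matches inside the first foo
say $m<foo>[1]<bar>.elems;   # bar matches inside the second foo
say $m<foo>[0]<bar>[0].Str;  # text of the first bar in the first foo
```

Each foo only carries its own bar captures, which mirrors the point above that a code block inside foo sees its own bars but not the other foos.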

answered Oct 31 '22 at 07:10 by user0721090601