Why do unicode quotes appear around a regex capture in perl6?

Tags: regex, raku

I'm using rakudo, and the following code:

"foo" ~~ m/(foo)/;
say $0;

I thought the output would be:

foo

However, I get:

「foo」

(That's foo with some weird unicode-y quote marks around it.)

I cannot find anything about this in the documentation, and I can't seem to get rid of those quotes. What's happening here?

Edit: Doing

say "$0";

instead gets rid of the quote marks, and both of

print $0;
print "$0";

do too. So I guess the capture isn't actually a string, and putting double quotes around it somehow turns it into a string? (By the way, $0.gist produces 「foo」, not foo.) Can anyone point me to the part of the documentation where I can learn about this behavior? I'm coming from Perl, and thoroughly confused.

Betta George asked Jan 04 '18 21:01


2 Answers

A capture returns a Match object, which stringifies to the matched string, as you discovered.

The documentation section "Grouping and Capturing" says:

An unquantified capture produces a Match object.

BTW, you can see what type the variable actually holds with .WHAT:

say $0.WHAT;
(Match)
Curt Tilmes answered Nov 14 '22 04:11


The say sub calls the .gist method. By contrast, the print sub calls the .Str method. There's also a put sub ("print using terminator"), which calls .Str and then does a newline. That's probably what you want to be using instead of say.
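
As a rough sketch of that difference, using the capture from the question (the variable name $m is only for illustration and is not part of the original code):

```raku
"foo" ~~ m/(foo)/;
my $m = $0;        # $m is a hypothetical name; it holds the Match object

say $m;            # 「foo」  say calls .gist
put $m;            # foo     put calls .Str, then prints a newline
print $m, "\n";    # foo     print calls .Str, newline added by hand
say ~$m;           # foo     prefix ~ coerces to Str before say sees it
```

If the output is meant for another program rather than for a person reading a terminal, put is usually the safer default.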

The .gist and .Str methods are two different ways to turn an object into a Str. The .gist method provides a human-friendly representation of the data that conveys its structure. If you .gist a complex Match with a bunch of captures, it will show those (and use indentation to display the match tree). By contrast, .Str doesn't try to reproduce structure; on a Match object, it just gives the text that the Match covers.
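
A small sketch of that contrast on a Match with several captures (the date string and the regex here are made up for illustration):

```raku
# Three positional captures separated by literal dashes.
"2018-01-04" ~~ / (\d+) '-' (\d+) '-' (\d+) /;

say $/;     # .gist: the whole match, then each capture on its own indented line
put $/;     # .Str: just the matched text, 2018-01-04
put $1;     # .Str of the second capture: 01
```

The say line is the one to reach for when debugging a regex or grammar, since it shows which capture got which piece of text.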

So, to summarize the differences between the Perl 5 and Perl 6 languages that you're running into:

  • Captures are Match objects, not strings (which is why grammars can produce a parse tree)
  • The say function in Perl 6 calls .gist
  • The put function in Perl 6 is mostly equivalent to the say function in Perl 5
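
To make the first bullet concrete, here is a minimal sketch of what a capture carries besides its text (the input string "a foo" is an assumption chosen so the offsets are non-zero):

```raku
"a foo" ~~ m/(foo)/;

say $0 ~~ Match;    # True
say $0 ~~ Str;      # False
say $0.from;        # 2      offset where the capture starts
say $0.to;          # 5      offset just past its end
say $0 eq "foo";    # True   eq compares the string values

my Str $s = ~$0;    # explicit coercion when a real Str is needed
put $s;             # foo
```

String operators such as eq and interpolation in double quotes coerce the Match for you, which is why say "$0" in the question printed plain foo.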

Finally, the square quotes were picked because they are relatively rare and thus unlikely to appear in any user data, so the captured data can be presented with very little chance of needing escape sequences. That provides a more easily readable overview of the Match in question, which is the aim of .gist.

Jonathan Worthington answered Nov 14 '22 03:11