Update: Corrected code added below
I have a Leanpub-flavored Markdown file named sample.md.
I'd like to convert its code blocks into GitHub-flavored Markdown style using a Raku regex. The file contains:
Here's a sample **ruby** code, which
prints the elements of an array:
{:lang="ruby"}
    ['Ian','Rich','Jon'].each {|x| puts x}
Here's a sample **shell** code, which
removes the ending commas and
finds all folders in the current path:
{:lang="shell"}
    sed s/,$//g
    find . -type d
In order to capture the lang value, e.g. ruby, from {:lang="ruby"} and convert it into ```ruby, I use this code:
my @in="sample.md".IO.lines;
my @out;
for @in.kv -> $key,$val {
    if $val.starts-with("\{:lang") {
        if $val ~~ /^{:lang="([a-z]+)"}$/ { # capture lang
            @out[$key]="```$0"; # convert it into ```ruby
            $key++;
            while @in[$key].starts-with(" ") {
                @out[$key]=@in[$key].trim-leading;
                $key++;
            }
            @out[$key]="```";
        }
    }
    @out[$key]=$val;
}
The line containing the regex gives the error Cannot modify an immutable Pair (lang => True).
I've just started out using regexes. Instead of ([a-z]+) I've tried (\w), and it gave the error Unrecognized backslash sequence: '\w', among other things.
How do I correctly capture and modify the lang value using a regex?
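Corrected code:
my @in = "sample.md".IO.lines;
my \len = @in.elems;
my @out;
my $k = 0;
while ($k < len) {
    if @in[$k] ~~ / ^ '{:lang="' (\w+) '"}' $ / {
        push @out, "```$0";    # opening fence with the captured language
        $k++;
        # copy the indented code lines, stopping at the end of the file
        while ($k < len && @in[$k].starts-with(" ")) {
            push @out, @in[$k].trim-leading;
            $k++;
        }
        push @out, "```";      # closing fence
    }
    else {
        push @out, @in[$k];    # ordinary line, copied unchanged
        $k++;
    }
}
.say for @out;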
This one-liner seems to solve the problem:
say S:g /\{\: "lang" \= \" (\w+) \" \} /```$0/ given "text.md".IO.slurp;
Let's try to explain what was going on, however. The error was a regular expression grammar error, caused by having a : followed by a name, all of it inside curly braces. In a Raku regex, {} runs code, so :lang="..." was parsed as an assignment to the Pair :lang (that is, lang => True), hence the "Cannot modify an immutable Pair" message. Raiph's answer is (obviously) correct, by changing it to a Perl regular expression. But what I've done here is to change it to Raku's non-destructive substitution, S///, with the :g global flag to make it act on the whole file (slurped at the end of the line; I've saved it to a file called text.md). So what this does is slurp your target file; with given, it's saved in the $_ topic variable, and the result is printed once the substitution has been made. The good thing is that if you want to make more substitutions, you can shove another such expression in front, and it will act on the output.
Using this kind of expression is generally going to be conceptually simpler, and possibly faster, than dealing with the text line by line.
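To make the "non-destructive" part concrete, here is a minimal sketch that runs the same kind of substitution on an in-memory string instead of a slurped file. The sample text and the variable names ($text, $converted) are made up for illustration, and the pattern uses quoted literals rather than backslash escapes, which should match the same marker lines.

```raku
# Hypothetical stand-in for the slurped file contents.
my $text = q:to/END/;
{:lang="ruby"}
    ['Ian','Rich','Jon'].each {|x| puts x}
{:lang="shell"}
    find . -type d
END

# S/// returns a modified copy of the topic; $text itself is not changed.
my $converted = S:g/ '{:lang="' (\w+) '"}' /```$0/ given $text;

say $converted;
say $text.contains('{:lang="ruby"}');   # the original still holds the Leanpub marker
```

Note that this only rewrites the marker lines: the code lines stay indented and no closing fence is added, so a line-by-line pass is still needed if those matter.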
TL;DR
TL? Then read @jjemerelo's excellent answer which not only provides a one-line solution but much more in a compact form;
DR? Aw, imo you're missing some good stuff in this answer that JJ (reasonably!) ignores. Though, again, JJ's is the bomb. Go read it first. :)
There are many dialects of regex. The regex pattern you've used is a Perl regex but you haven't told Raku that. So it's interpreting your regex as a Raku regex, not a Perl regex. It's like feeding Python code to perl. So the error message is useless.
One option is to switch to Perl regex handling. To do that, this code:
/^{:lang="([a-z]+)"}$/
needs m :P5 at the start:
m :P5 /^{:lang="([a-z]+)"}$/
The m is implicit when you use /.../ in a context where it is presumed you mean to match immediately, but because the :P5 "adverb" is being added to modify how Raku interprets the pattern in the regex, one also has to add the m.
:P5 only supports a limited set of Perl's regex patterns. That said, it should be enough for the regex you've written in your question.
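As a small sketch of how that looks in use, the snippet below applies the pattern from the question, with the :P5 adverb, to a single hard-coded line. The $line variable is made up for illustration, and the snippet assumes, as the paragraph above suggests, that this pattern falls within the subset :P5 supports.

```raku
# Hypothetical input line, as it would appear in sample.md.
my $line = '{:lang="ruby"}';

# :P5 asks Raku to read the pattern using Perl regex syntax,
# so the braces and the [a-z] class are taken the Perl way.
if $line ~~ m:P5/^{:lang="([a-z]+)"}$/ {
    say "language: $0";   # $0 is the first capture group
}
```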
If you want to use a Raku regex you have to learn the Raku regex language.
The "spirit" of the Raku regex language is the same as Perl's, and some of the absolute basic syntax is the same as Perl's, but it's different enough that you should view it as yet another dialect of regex, just one that's generally "powered up" relative to Perl's regexes.
To rewrite the regex in Raku format I think it would be:
/ ^ '{:lang="' (<[a..z]>+) '"}' $ /
(This takes advantage of the fact that whitespace in Raku regexes is ignored.)
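A minimal sketch of that Raku regex in action, on a hard-coded line (the $line variable is made up for illustration): quoted strings match literally, <[a..z]> is a character class, and the parentheses capture into $0.

```raku
# Hypothetical input line, as it would appear in sample.md.
my $line = '{:lang="shell"}';

if $line ~~ / ^ '{:lang="' (<[a..z]>+) '"}' $ / {
    say "```$0";   # $0 holds the captured language name
}
```

Quoting the literal parts avoids having to backslash-escape each of {, :, = and " individually.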
After fixing the regex, one encounters other problems in your code.
The first problem I encountered is that $key is read-only, so $key++ fails. One option is to make it writable, by writing -> $key is copy ..., which makes $key a read-write copy of the index passed by the .kv.
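Here is a small sketch of what is copy does, using a made-up three-element array (@lines) rather than the file. It also shows a limitation worth keeping in mind: because $key is only a copy, incrementing it does not make the for loop skip ahead.

```raku
# Hypothetical data standing in for the lines of the file.
my @lines = <a b c>;

for @lines.kv -> $key is copy, $val {
    $key++;              # allowed: $key is a private, writable copy
    say "$key: $val";    # the loop still visits every element in order
}
```

So is copy removes the read-only error, but advancing through the array by hand needs an explicit index, as the while loop in the corrected code does.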
But fixing that leads to another problem. And the code is so complex I've concluded I'd best not chase things further. I've addressed your immediate obstacle and hope that helps.