
Can Raku range operator on strings mimic Perl's behaviour?

Tags:

raku

In Perl, the expression "aa" .. "bb" creates a list with the strings:

aa ab ac ad ae af ag ah ai aj ak al am an ao ap aq ar as at au av aw ax ay az ba bb

In Raku, however (at least with Rakudo v2021.08), the same expression creates:

aa ab ba bb

Even worse, while "12" .. "23" in Perl creates a list of strings with the numbers 12, 13, 14, 15, ..., 23, in Raku the same expression creates the list ("12", "13", "22", "23").

The docs seem to be quite silent about this behaviour; at least, I could not find an explanation there. Is there any way to get Perl's behaviour for Raku ranges?

(I know that the second problem can be solved via typecast to Int. This does not apply to the first problem, though.)
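For reference, the Int coercion mentioned above could look like the following minimal sketch. The variable name `@numbers` is only illustrative, and the `.map(*.Str)` step is an assumption about wanting strings back, as Perl produces.

```raku
# Coerce both endpoints to Int so the range is numeric, then turn
# each element back into a Str to match Perl's list of strings.
my @numbers = ("12".Int .. "23".Int).map(*.Str);
say @numbers;   # [12 13 14 15 16 17 18 19 20 21 22 23]
```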

Nikola Benes asked Dec 05 '21 22:12



2 Answers

[I'm splitting this into a separate answer because it addresses the "why" instead of the "how"]

I did a bit of digging, and learned that:

  1. For Sequences, having "aa"…"bb" produce "aa", "ab", "ba", "bb" is specified in Roast
  2. The original use case provided for this behavior was generating sequences of octal numbers (as Strs) (discussed again in 2018)
  3. For Ranges, the behavior of "aa".."bb" is currently unspecified and there does not appear to be consensus about what it should be.
  4. (As you already know), Rakudo's implementation has "aa".."bb" behave the same as "aa"…"bb".
  5. In 2018, lizmat (Elizabeth Mattijsen, https://stackoverflow.com/users/7424470/elizabeth-mattijsen on StackOverflow) changed .. to make "aa".."bb" behave the way it does in Perl, but reverted that change pending consensus on the correct behavior.

So I suppose we (as a community) are still thinking about it? Personally, I'm inclined to agree with lizmat that having "aa".."bb" provide the longer range (like Perl) makes sense: if users want the shorter one, they can use a sequence. (Or, for an octal range, something like (0..0o377).map: *.fmt('%03o'))
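A short sketch of the options discussed here, with illustrative variable names. The first two lines restate behavior described above. The last one is an assumption on my part: giving the sequence an explicit `.succ` generator should step by string increment instead, and it relies on the generator reaching the endpoint exactly.

```raku
# Shorter, per-position result via the sequence operator (as specified in Roast).
my @short = "aa" ... "bb";
say @short;              # [aa ab ba bb]

# Octal strings from a numeric range, formatted as three octal digits.
my @octal = (0..0o377).map: *.fmt('%03o');
say @octal.head(3);      # (000 001 002)
say @octal.tail;         # 377

# Assumption: an explicit .succ generator steps with string increment,
# which yields the longer, Perl-like list for these endpoints.
my @long = "aa", *.succ ... "bb";
say @long.elems;         # 28
```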

But, either way, I definitely agree with that 2018 commit that we should pin this down in Roast – and then get it noted in the docs.

codesections answered Nov 17 '22 22:11



TL;DR Add one or more extra characters to the endpoint string. It doesn't matter what the character(s) is/are.


10 years after the current doc corpus was kick-started by Moritz Lenz++, Raku's doc is, as ever, a work in progress.

There's a goldmine of more than 16 years' worth of chat logs that I sometimes spelunk, looking for answers. A search for range "as words" with nick: TimToady netted me this in a few minutes:

TimToady beginning and ending of the same length now do the specced semantics

considering each position as a separate character range

My instant reaction:

  • Here's why it does what it does. The guy who designed how Perl's range works not only deliberately specced it to work how it now does in Raku but implemented it in Rakudo himself in 2015.

  • It does that iff "beginning and ending of the same length". Hmm. 💡
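One way to picture "each position as a separate character range" is as a cross product of single-character ranges. This is only a simplified model for same-length, two-character endpoints, not a claim about how Rakudo implements it; the variable names are illustrative.

```raku
# Model of "each position as a separate character range":
# cross the per-position ranges and concatenate the pieces.
my @letters = ("a".."b") X~ ("a".."b");
my @digits  = ("1".."2") X~ ("2".."3");
say @letters;   # [aa ab ba bb]
say @digits;    # [12 13 22 23]
```

Both results match what "aa" .. "bb" and "12" .. "23" produce in the question.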

A few seconds later:

say flat "aa" .. "bb (like perl)";
say flat "12" .. "23 (like perl)";

displays:

(aa ab ac ad ae af ag ah ai aj ak al am an ao ap aq ar as at au av aw ax ay az ba bb)
(12 13 14 15 16 17 18 19 20 21 22 23)

😊

raiph answered Nov 17 '22 22:11
