Raku: return Type

Tags:

raku

I want to write a function returning an array whose all subarrays must have a length of two. For example return will be [[1, 2], [3, 4]].

I define:

(1) subset TArray of Array where { .all ~~ subset :: where [Int, Int] };

and

sub fcn(Int $n) of TArray is export(:fcn) {
    [[1, 2], [3, 4]];
}

I find (1) over-complicated. Is there something simpler?

user3166747 asked Sep 04 '21 07:09


2 Answers

Stepping back first

subset TArray of Array where { .all ~~ subset :: where [Int, Int] };

Is there something simpler?

Before we go there, let's step back. Beyond looking over-complicated, your code is also potentially problematic for reasons that may not be obvious. I'll highlight three:

  • This subset accepts an Array of Arrays in which each inner array contains two Ints, but it doesn't mandate an Array[Array[Int]]. The outer Array's type may be just a generic Array, rather than an Array[Array], let alone an Array[Array[Int]]. Indeed it will be generic unless you deliberately introduce strongly typed values. I cover strong typing in the last section of this answer.

  • What about an empty Array? Your subset will accept that. Is that your intent? If not, what about requiring at least one pair of Ints?

  • The outer where clause uses a common Raku idiom of the form .all ~~ ..., with a junction on the left hand side of the ~~ smart match operator. Astonishingly, per an issue I just filed, this may be a problem. What alternatives are there?
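The first two points can be checked directly. This is a minimal sketch, assuming a reasonably recent Rakudo; the variable name $pairs is only for illustration and does not come from the question.

```raku
# A nested array literal is a plain Array, not an Array[Array[Int]].
my $pairs = [[1, 2], [3, 4]];
say $pairs ~~ Array;               # True
say $pairs ~~ Array[Array[Int]];   # False

# An "all" junction over zero elements is vacuously true, which is why
# a check built on .all lets an empty Array through.
say so [].all == 2;                # True
```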

Starting simple

Raku does a decent job of keeping simple things simple. If we put aside any artificial desire for strong typing, and focus on simple tools for tightening code up, a simple subset I would have suggested in the past would be:

subset TArray where .all == 2; # BAD despite being idiomatic???

This has all of the problems of your original code, and in addition it accepts data that has non-integers where integers belong.

But it does have the redeeming qualities that it does a useful check (that the inner arrays each have two elements) and it's significantly simpler than your code.

Now that I've reminded myself to treat .all on the left-hand side of ~~ as a possible problem, I'll instead write it as:

subset TArray where 2 == .all; # Potentially the new idiomatic.

This version reads less naturally, but while readability is important, basic correctness matters more.
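A quick way to see what this subset does and doesn't check is to smartmatch a few values against it. The sample values below are illustrative assumptions, not data from the question.

```raku
# Element-count check only, with the junction on the right of ==.
subset TArray where 2 == .all;

say [[1, 2], [3, 4]]         ~~ TArray;   # True
say [[1, 2], [3]]            ~~ TArray;   # False (second subarray has one element)
say [['a', 'b'], ['c', 'd']] ~~ TArray;   # True  (non-integers are still accepted)
```

Each sample uses at least two subarrays because a literal such as [[1, 2]] is flattened to [1, 2] by the single-argument rule; write [[1, 2],] to get one nested pair.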

Still fairly simple, and fewer problems

Here are two variants I came up with:

subset TArray where all .map: * ~~ (Int,Int);
subset TArray where .elems == .grep: (Int,Int);

These both avoid the junction/smartmatch problem. (The first where expression does have a junction to the left of a smart match, but it's not an example of the problem.)

The second version isn't so obviously correct. Think of it as checking that the count of subarrays equals the count of subarrays that match (Int,Int). But it lends itself nicely to fixing the problem of matching when there are zero subarrays, if that needs fixing:

subset TArray where 0 < .elems == .grep: (Int,Int);
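As a sanity check, here is how that last subset could be exercised. This is only a sketch with made-up sample values.

```raku
# Require at least one subarray, and every subarray to match (Int,Int).
subset TArray where 0 < .elems == .grep: (Int,Int);

say [[1, 2], [3, 4]]     ~~ TArray;   # True
say [[1, 2], [3, 4, 5]]  ~~ TArray;   # False (wrong length)
say [[1, 2], ['a', 'b']] ~~ TArray;   # False (not Ints)
say []                   ~~ TArray;   # False (no subarrays)
```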

Strong typing solutions

The solutions thus far don't deal with strong typing. Perhaps that's desirable. Perhaps not.

To understand what I mean by this, let's first look at literals:

say WHAT 1;             # (Int)
say WHAT [1,2];         # (Array)
say WHAT [[1,2],[3,4]]; # (Array)

  • These values have types determined by their literal constructors.

  • The last two are just Arrays, generic over their elements.

    (The second is not an Array[Int], which might be expected. Similarly the last one is not an Array[Array[Int]].)

    Current built in Raku literal forms for composite types (arrays and hashes) all construct generic Arrays which do not constrain the types of their elements.

    See the PR Introduce [1,2,3]:Int syntax #4406 for a proposal/PR regarding element typed composite literals and a related issue I just posted in response to your Q here about an alternative and/or complementary approach to that PR. (There have been discussions over the years about this aspect of the type system but it seems like it's time for Rakoons to look at addressing it.)
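For contrast, an element-typed array has to be asked for explicitly, either through a typed declaration or through the parameterized type's constructor. A small sketch follows; the name @typed is illustrative.

```raku
# Element typing comes from the declaration or constructor, not the literal.
my Int @typed = 1, 2;
say @typed.WHAT;                  # (Array[Int])
say Array[Int].new(1, 2).WHAT;    # (Array[Int])

say [1, 2] ~~ Array[Int];         # False
say @typed ~~ Array[Int];         # True
```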

What if you wanted to build a strongly typed data structure as the value to return from your routine, and to have the return type check that?


Here's one way one might build such a strongly typed value:

my Array[Array[Int]] $result .= new: Array[Int].new(1,2), Array[Int].new(3,4);

Super verbose! But now you could write the following for your sub's return type check and it'll work:

subset TArray of Array[Array[Int]] where 0 < .elems == .grep: (Int,Int);

sub fcn(Int $n) of TArray is export(:fcn) {
  my Array[Array[Int]] $result .= new: Array[Int].new(1,2), Array[Int].new(3,4);
}
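To see the return type check in action, the sketch below calls fcn and then tries a second routine that returns an untyped literal. The routine name loose and the variable $accepted are hypothetical, and the export trait is dropped so the snippet runs as a standalone script.

```raku
subset TArray of Array[Array[Int]] where 0 < .elems == .grep: (Int,Int);

# Same body as above, minus "is export" (assumption: run as a script).
sub fcn(Int $n) of TArray {
    my Array[Array[Int]] $result .= new: Array[Int].new(1,2), Array[Int].new(3,4);
}

say fcn(1).WHAT;        # (Array[Array[Int]])
say fcn(1) ~~ TArray;   # True

# Hypothetical counter-example: an untyped literal should fail the
# nominal Array[Array[Int]] part of the return type check.
sub loose(Int $n) of TArray { [[1,2],[3,4]] }

my $accepted = try { loose(1); True } // False;
say $accepted;          # False
```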

Another way to build a strongly typed value is to specify not only the strong typing in a variable's type constraint, but also coercion typing to bridge from a loosely typed value to a strongly typed target.

We keep the exact same subset (that establishes the strongly typed target data structure and adds "refinement typing" checks):

subset TArray of Array[Array[Int]] where 0 < .elems == .grep: (Int,Int);

But instead of using a verbose correct-by-construction initialization value, using full type names and news, we introduce additional coercion typing and then just use ordinary literal syntax:

constant TArrayInitialization = TArray(Array[Array[Int]()]());

sub fcn(Int $n) of TArray is export(:fcn) {
  my TArrayInitialization $result = [[1,2],[3,4]];
}

(I could have written the TArrayInitialization declaration as another subset, but it would be a slight overkill to have done so. A constant does the job with less fuss.)

raiph answered Nov 08 '22 03:11


I gather that the aim is to restrict the type of the inner Array to [Int,Int] ... the closest I can get to this is to declare two subsets, one based on the other...

subset IArray where * ~~ [Int, Int];
subset TArray where .all ~~ IArray;

Otherwise, the anonymous subset form you use seems to be the briefest, although as @raiph points out you can drop the 'of Array' piece.

p6steve answered Nov 08 '22 02:11