I want to get the last element of a lazy but finite Seq in Raku, e.g.:
my $s = lazy gather for ^10 { take $_ };
The following don't work:
say $s[* - 1];
say $s.tail;
These ones work but don't seem too idiomatic:
say (for $s<> { $_ }).tail;
say (for $s<> { $_ })[* - 1];
What is the most idiomatic way of doing this while keeping the original Seq lazy?
What you're asking about ("get[ting] the last element of a lazy but finite Seq … while keeping the original Seq lazy") isn't possible. I don't mean that it's not possible with Raku – I mean that, in principle, it's not possible for any language that defines "laziness" the way Raku does with, for example, the is-lazy method.
In particular, when a Seq is lazy in Raku, that "means that [the Seq's] values are computed on demand and stored for later use." Additionally, one of the defining features of a lazy iterable is that it cannot know its own length while remaining lazy – that's why calling .elems on a lazy iterable throws an error:
my $s = lazy gather for ^10 { take $_ };
say $s.is-lazy; # OUTPUT: «True»
$s.elems; # THROWS: «Cannot .elems a lazy list onto a Seq»
Now, at this point, you might reasonably be thinking "well, maybe Raku doesn't know how long $s is, but I can tell that it has exactly 10 elements in it." And you're not wrong – with that code, $s is indeed guaranteed to have 10 elements. This means that, if you want to get the tenth (last) element of $s, you can do so with $s[9]. And accessing $s's tenth element like that won't change the fact that $s.is-lazy.
But, importantly, you can only do so because you know something "extra" about $s, and that extra info undoes a good chunk of the reason you might want a list to be lazy in practice.
To see what I mean, consider a very similar Seq:
my $s2 = lazy gather for ^10 { last if rand > .95; take $_ };
say $s2.is-lazy; # OUTPUT: «True»
Now, $s2 probably has 10 elements, but it might not – the only way to know is to iterate through it and find out. In turn, this means $s2[9] does not jump to the tenth element the way $s[9] did; it iterates through $s2 just like you'd need to. And, as a result, if you run $s2[9], then $s2 will no longer be lazy (i.e., $s2.is-lazy will return False).
And this is, in effect, what you did in the code in your question:
my $s = lazy gather for ^10 { take $_ };
say $s.is-lazy; # OUTPUT: «True»
say (for $s<> { $_ }).tail; # OUTPUT: «9»
say $s.is-lazy; # OUTPUT: «False»
Because Raku cannot ever know that it has reached the tail of a lazy Seq, the only way it could tell you the .tail is to fully iterate $s. And that necessarily means that $s is no longer lazy.
It's worth mentioning two adjacent topics that aren't actually related but that are close enough that they trip some people up.
First, nothing I've said about lazy iterables not knowing their length precludes some non-lazy iterables from knowing their length. Indeed, a decent number of Raku types do both the Iterator role and the PredictiveIterator role – and the main point of a PredictiveIterator is that it does know how many elements it can produce without needing to produce/iterate them. But PredictiveIterators cannot be lazy.
The second potentially confusing topic is closely related to the first: while no PredictiveIterator can be lazy (that is, none will ever have an .is-lazy method that returns True), some PredictiveIterators have behavior that is very similar to laziness – and, in fact, may even be colloquially referred to as "lazy".
I can't do a great job explaining this distinction because, quite honestly, I don't fully understand it myself. But I can give you an example: the .lines method on an IO::Handle. It's certainly the case that reading the lines of a huge file behaves a lot like it's dealing with a lazy iterable. Most obviously, you can process each line without ever having the whole file in memory. And the docs even say that "lines are read lazily" with the .lines method.
On the other hand:
my $l = 'some-file-with-100_000-lines.txt'.IO.lines;
say $l.is-lazy; # OUTPUT: «False»
say $l.iterator ~~ PredictiveIterator; # OUTPUT: «True»
say $l.elems; # OUTPUT: «100000»
So I'm not quite sure whether it's fair to say that $l "is a lazy iterable", but if it is, it's "lazy" in a different way than $s was.
I realize that was a lot, but I hope it is helpful. If you have a more specific use case in mind for laziness (I bet it wasn't gathering the numbers from zero to nine!), I'd be happy to address that more specifically. And if anyone else can fill in some of the details with .lines and other lazy-not-lazy PredictiveIterators, I'd really appreciate it!
Lazy sequences in Raku are designed to work well as is. You don't need to emphasize they're lazy by adding an explicit lazy.
If you add an explicit lazy, Raku interprets that as a request to block operations such as .tail because they will almost certainly immediately render laziness moot, and, if called on an infinite sequence, or even just a sufficiently large one, hang or OOM the program.
So, either drop the lazy, or don't invoke operations like .tail that will be blocked if you do.
As noted by @ugexe, the idiomatic solution is to drop the lazy.
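As a minimal sketch of that advice, assuming the same generator as in the question with only the explicit lazy removed (the variable name $last is introduced here purely for illustration):

```raku
# Same generator as in the question, but without the explicit `lazy`.
# Values are still produced on demand by the gather block.
my $s = gather for ^10 { take $_ };

say $s.is-lazy;      # False: a plain gather does not report itself as lazy

# .tail is accepted because .is-lazy is False; it iterates the Seq to its end.
my $last = $s.tail;
say $last;           # 9
```

Note that .tail consumes the Seq here, so iterating $s again afterwards would need the values to have been cached first.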
Quoting my answer to the SO About Laziness:
if a gather is asked if it's lazy, it returns False.
Aiui, something like the following applies:
Some lazy sequence producers may be actually or effectively infinite. If so, calling .tail etc on them will hang the calling program. Conversely, other lazy sequences perform fine when all their values are consumed in one go. How should Raku distinguish between these two scenarios?
A decision was made in 2015 to let value producing datatypes emphasize or deemphasize their laziness via their response to an .is-lazy call.
Returning True signals that a sequence is not only lazy but wants to be known to be lazy by consuming code that calls .is-lazy. (Not so much end-user code but instead built in consuming features such as @-sigilled variables handling an assignment trying to determine whether or not to assign eagerly.) Built in consuming features take a True as a signal they ought to block calls like .tail. If a dev knows this is overly conservative, they can add an eager (or remove an unneeded lazy).
Conversely, a datatype, or even a particular object instance, may return False to signal that it does not want to be considered lazy. This may be because the actual behaviour of a particular datatype or instance is eager, but it might instead be that it is lazy technically, but doesn't want a consumer to block operations such as .tail because it knows they will not be harmful, or at least prefers to have that be the default presumption. If a dev knows better (because, say, it hangs the program), or at least does not want to block potentially problematic operations, they can add a lazy (or remove an unneeded eager).
I think this approach works well, but the doc and error messages mentioning "lazy" may not have caught up with the shift made in 2015. So:
If you've been confused by some doc about laziness, please search for doc issues with "lazy" in them, or "laziness", and add comments to existing issues, or file a new doc issue (perhaps linking to this SO answer).
If you've been confused by a Rakudo error message mentioning laziness, please search for Rakudo issues with "lazy" in them, and tagged [LTA] (which means "Less Than Awesome"), and add comments, or file a new Rakudo issue (with an [LTA] tag, and perhaps a link to this SO answer).
the docs ... say “If you want to force lazy evaluation use the lazy subroutine or method. Binding to a scalar or sigilless container will also force laziness.”
Yes. Aiui this is correct.
[which] sounds like it implies “my $x := lazy gather { ... } is the same as my $x := gather { ... }”.
No.
An explicit lazy statement prefix or method adds emphasis to laziness, and Raku interprets that to mean it ought to block operations like .tail in case they hang the program.
In contrast, binding to a variable alters neither emphasis nor deemphasis of laziness, merely relaying onward whatever the bound producer datatype/instance has chosen to convey via .is-lazy.
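The difference between the two bindings can be sketched as follows. The variable names $plain and $marked are illustrative only; the generator is the one from the question:

```raku
# Binding relays whatever the producer reports via .is-lazy.
my $plain  := gather for ^10 { take $_ };        # no explicit emphasis
my $marked := lazy gather for ^10 { take $_ };   # explicit `lazy` prefix

say $plain.is-lazy;    # False: gather on its own does not claim laziness
say $marked.is-lazy;   # True: the `lazy` prefix forces the answer
```

Both sequences produce their values on demand; only the answer to .is-lazy, and therefore how cautious consumers behave, differs.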
not only in connection with gather but elsewhere as well
Yes. It's about the result of .is-lazy:
my $x = (1, { .say; $_ + 1 } ... 1000);
my $y = lazy (1, { .say; $_ + 1 } ... 1000);
both act lazily ... but $x.tail is possible while $y.tail is not.
Yes.
An explicit lazy statement prefix or method forces the answer to .is-lazy to be True. This signals to a consumer that cares about the dangers of laziness that it should become cautious (eg rejecting .tail etc.).
(Conversely, an eager statement prefix or method can be used to force the answer to .is-lazy to be False, making timid consumers accept .tail etc calls.)
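A minimal sketch of the eager route, assuming you are handed a Seq that was explicitly marked lazy and you know it is finite (the names $y and $last are illustrative):

```raku
# A finite sequence that has been explicitly marked lazy.
my $y = lazy gather for ^10 { take $_ };
say $y.is-lazy;            # True, so a direct $y.tail would be rejected

# .eager evaluates the whole sequence into a List, which accepts .tail.
# This is only safe because the sequence is known to be finite.
my $last = $y.eager.tail;
say $last;                 # 9
```

The trade-off is the one described above: .eager produces every element up front, so it should only be used when consuming the whole sequence is acceptable.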
I take from this that there are two kinds of laziness in Raku, and one has to be careful to see which one is being used where.
It's two kinds of what I'll call consumption guidance:
Don't-tail-me: If an object returns True from an .is-lazy call then it is treated as if it might be infinite. Thus operations like .tail are blocked.
You-can-tail-me: If an object returns False from an .is-lazy call then operations like .tail are accepted.
It's not so much that there's a need to be careful about which of these two kinds is in play, but if one wants to call operations like .tail, then one may need to enable that by inserting an eager or removing a lazy, and one must take responsibility for the consequences:
If the program hangs due to use of .tail, well, DIHWIDT.
If you suddenly consume all of a lazy sequence and haven't cached it, well, maybe you should cache it.
Etc.
What I would say is that the error messages and/or doc may well need to be improved.