What's the difference between Ruby Enumerable/Array first(n) and take(n)?
I vaguely recall that take has something to do with lazy evaluation, but I can't figure out how to use it that way, and I can't find anything useful by googling or in the docs. "take" is a hard method name to google for.
first(n) and take(n) are documented almost identically, which isn't much help.
first → obj or nil
first(n) → an_array
Returns the first element, or the first n elements, of the enumerable. If the enumerable is empty, the first form returns nil, and the second form returns an empty array.
-
take(n) → array
Returns first n elements from enum.
Telling me "take has something to do with lazy evaluation" isn't enough; I sort of remember that already. I need an example of how to use it that way, compared to first.
Well, I've looked at the source (Ruby 2.1.5). Under the hood, if first is given an argument, it forwards it to take. Otherwise, it returns a single value:
static VALUE
enum_first(int argc, VALUE *argv, VALUE obj)
{
    NODE *memo;
    rb_check_arity(argc, 0, 1);
    if (argc > 0) {
        return enum_take(obj, argv[0]);
    }
    else {
        memo = NEW_MEMO(Qnil, 0, 0);
        rb_block_call(obj, id_each, 0, 0, first_i, (VALUE)memo);
        return memo->u1.value;
    }
}
take, on the other hand, requires an argument and always returns an array of the given size or smaller, with the elements taken from the beginning.
static VALUE
enum_take(VALUE obj, VALUE n)
{
    NODE *memo;
    VALUE result;
    long len = NUM2LONG(n);
    if (len < 0) {
        rb_raise(rb_eArgError, "attempt to take negative size");
    }
    if (len == 0) return rb_ary_new2(0);
    result = rb_ary_new2(len);
    memo = NEW_MEMO(result, 0, len);
    rb_block_call(obj, id_each, 0, 0, take_i, (VALUE)memo);
    return result;
}
So yes, that's why these two are so similar. The only difference seems to be that first can be called without arguments, in which case it returns a single value rather than an array. <...>.first(1), on the other hand, is equivalent to <...>.take(1). As simple as that.
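To make the eager case concrete, here is a small sketch on a plain array. The variable names are only illustrative, and the behaviour shown is what I'd expect from the C source above rather than an exhaustive specification.

```ruby
# Eager collections: first(n) and take(n) return the same array.
list = [10, 20, 30]

first_two = list.first(2)    # => [10, 20]
take_two  = list.take(2)     # => [10, 20]

# Only first can be called without an argument; it then returns one element.
single        = list.first   # => 10
empty_first   = [].first     # => nil
empty_first_n = [].first(2)  # => []

# take always requires an argument, so calling it bare raises ArgumentError.
take_without_arg =
  begin
    list.take
  rescue ArgumentError => e
    e.class
  end

# Asking for more elements than exist just gives a shorter array.
over = list.take(5)          # => [10, 20, 30]
```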
With lazy collections, however, things are different. first on a lazy collection is still enum_first, which, as seen above, is a shortcut to enum_take. take, however, is the C-coded lazy_take:
static VALUE
lazy_take(VALUE obj, VALUE n)
{
    long len = NUM2LONG(n);
    VALUE lazy;
    if (len < 0) {
        rb_raise(rb_eArgError, "attempt to take negative size");
    }
    if (len == 0) {
        VALUE len = INT2FIX(0);
        lazy = lazy_to_enum_i(obj, sym_cycle, 1, &len, 0);
    }
    else {
        lazy = rb_block_call(rb_cLazy, id_new, 1, &obj,
                             lazy_take_func, n);
    }
    return lazy_set_method(lazy, rb_ary_new3(1, n), lazy_take_size);
}
...which doesn't evaluate immediately; it requires a .force call for that.
In fact, the docs hint at this: the entry for lazy lists all the lazily implemented methods. The list contains take but not first. In other words, on lazy sequences take stays lazy and first doesn't.
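One way to check this without reading the C source is to ask Ruby where each method is defined. This is only a quick sketch, and it assumes the method layout I know from the Ruby versions I've looked at: take is overridden on Enumerator::Lazy, while first is inherited from Enumerable.

```ruby
# Ask which class or module actually defines each method for lazy enumerators.
take_owner  = Enumerator::Lazy.instance_method(:take).owner
first_owner = Enumerator::Lazy.instance_method(:first).owner

puts "take  is defined in #{take_owner}"
puts "first is defined in #{first_owner}"
```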
Here's an example of how they behave differently:
lz = (1..Float::INFINITY).lazy.map{|i| i }
# An infinite sequence, evaluating it head-on won't do
# Ruby 2.2 also offers `.map(&:itself)`
lz.take(5)
#=> #<Enumerator::Lazy: ...>
# Well, `take` is lazy then
# Still, we need values
lz.take(5).force
#=> [1, 2, 3, 4, 5]
# Why yes, values, finally
lz.first(5)
#=> [1, 2, 3, 4, 5]
# So `first` is not lazy, it evaluates values immediately
Some extra fun can be had by running the 2.2 code (<...>.lazy.map(&:itself)) on a version prior to 2.2: itself doesn't exist there, so the moment you lose laziness a NoMethodError is raised immediately.
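If you'd rather not rely on an exception to see when evaluation happens, a side-effect log shows the same thing. This is a sketch only; the evaluated array and the doubling block are my own illustration, not part of the original code.

```ruby
evaluated = []

# Record every element the pipeline actually touches.
lz = (1..Float::INFINITY).lazy.map { |i| evaluated << i; i * 2 }

taken = lz.take(3)                  # still an Enumerator::Lazy
count_after_take = evaluated.size   # nothing has been evaluated yet

forced = taken.force                # evaluation happens here
count_after_force = evaluated.size

evaluated.clear
firsts = lz.first(3)                # first evaluates right away
count_after_first = evaluated.size

puts "after take:  #{count_after_take}"
puts "after force: #{count_after_force} #{forced.inspect}"
puts "after first: #{count_after_first} #{firsts.inspect}"
```

Because take returns another lazy enumerator, you can keep chaining lazy operations after it, whereas first hands you a plain array and ends the lazy chain.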