I just wanted to concatenate multiple arrays in Ruby and couldn't find a satisfying way to do so.
Example input:
foo = [1, 2, 3]
bar = [4, 5, 6]
baz = [7, 8, 9]
Expected result: (without modifying the existing arrays)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
My actual arrays are much larger, so I'm interested in an efficient solution. There may also be more than three arrays, so a short syntax is preferred.
foo + bar + baz
is the obvious one, it's concise and clear. But it is evaluated as (foo + bar) + baz
. In other words: it creates an intermediate array [1, 2, 3, 4, 5, 6]
that is thrown away after the whole operation. As noted in the documentation:
repeated use of
+=
on arrays can be quite inefficient
[*foo, *bar, *baz]
basically inlines the elements which is not very efficient for large arrays, either. It also looks more like a hack to me.
[foo, bar, baz].flatten(1)
seems to be even worse than the above, performance wise.
[].concat(foo).concat(bar).concat(baz)
is the fastest, but it looks cumbersome and it needs multiple method invocations.
Shouldn't there be a simple class method for such a basic task? Something like:
Array.concat(foo, bar, baz)
Am I missing something obvious?
Ruby | Array concat() operation Array#concat() : concat() is a Array class method which returns the array after appending the two arrays together.
The OP's question, as noted in other answers, is comparing two operators that perform different purposes. One, concat , which is destructive to (mutates) the original array, and + which is non-destructive (pure functional, no mutation).
Concatenating strings or string concatenation means joining two or more strings together. In Ruby, we can use the plus + operator to do this, or we can use the double less than operator << .
The 'reduce' method can be used to take an array and reduce it to a single value. Lets start with a simple demonstration of this method. Lets say we want write a method to find the sum of all numbers in an array of numbers. You may be inclined to write your code in this manner: def sum(array) sum = 0.
If you've already determined that multiple concatenation is the fastest method, you can write it nicer using reduce:
[foo, bar, baz].reduce([], :concat)
I've created another benchmark, comparing +
, concat
and a custom C extension with a variable number of arrays.
concat
plus
is getting really slow if you concatenate many arraysAlthough "2-3x" sounds like a huge improvement, it's just a few milliseconds in absolute terms. I was expecting a bigger difference by not having to resize the array, but this is apparently not a huge factor.
IMO, concat
is a decent performer and I see no urgent need for a C extension.
My test arrays contain nil
values. Other elements don't seem to produce different results (in relative terms).
I didn't include flat_map
, because it is equivalent to concat
.
Concatenating 3 arrays of size 100 (10000 times)
user system total real
plus 0.020000 0.000000 0.020000 ( 0.027927)
concat 0.020000 0.010000 0.030000 ( 0.033204)
c_extension 0.010000 0.010000 0.020000 ( 0.010727)
Concatenating 10 arrays of size 100 (10000 times)
user system total real
plus 0.110000 0.070000 0.180000 ( 0.180417)
concat 0.050000 0.020000 0.070000 ( 0.065299)
c_extension 0.010000 0.010000 0.020000 ( 0.025475)
Concatenating 10 arrays of size 1000 (10000 times)
user system total real
plus 0.690000 0.560000 1.250000 ( 1.252319)
concat 0.180000 0.130000 0.310000 ( 0.303365)
c_extension 0.120000 0.120000 0.240000 ( 0.248589)
plus
is excluded from the following results
Concatenating 10 arrays of size 100000 (100 times)
user system total real
concat 0.220000 0.340000 0.560000 ( 0.568730)
c_extension 0.130000 0.150000 0.280000 ( 0.281354)
Concatenating 100 arrays of size 10000 (100 times)
user system total real
concat 0.210000 0.320000 0.530000 ( 0.519030)
c_extension 0.160000 0.140000 0.300000 ( 0.304751)
Concatenating 1000 arrays of size 1000 (100 times)
user system total real
concat 0.240000 0.330000 0.570000 ( 0.563511)
c_extension 0.150000 0.120000 0.270000 ( 0.283546)
Concatenating 10000 arrays of size 100 (100 times)
user system total real
concat 0.330000 0.310000 0.640000 ( 0.643987)
c_extension 0.170000 0.120000 0.290000 ( 0.286489)
Concatenating 100000 arrays of size 10 (100 times)
user system total real
concat 1.300000 0.340000 1.640000 ( 1.648687)
c_extension 0.310000 0.150000 0.460000 ( 0.458214)
Test code:
require 'benchmark'
values = [
# small
{ count: 3, size: 100, n: 10000 },
{ count: 10, size: 100, n: 10000 },
{ count: 10, size: 1000, n: 10000 },
# large
{ count: 10, size: 100000, n: 100 },
{ count: 100, size: 10000, n: 100 },
{ count: 1000, size: 1000, n: 100 },
{ count: 10000, size: 100, n: 100 },
{ count: 100000, size: 10, n: 100 }
]
values.each_with_index do |h, i|
count, size, n = h.values_at(:count, :size, :n)
arrays = Array.new(count) { Array.new(size) }
puts
puts "Concatenating #{count} arrays of size #{size} (#{n} times)"
Benchmark.bm(10) do |r|
r.report('plus') { n.times { arrays.reduce(:+) } } if i < 3
r.report('concat') { n.times { arrays.reduce([], :concat) } }
r.report('c_extension') { n.times { Array.concat(*arrays) } }
end
end
C extension: (a patch actually, I've added this to Ruby's array.c
)
VALUE
rb_ary_s_concat(int argc, VALUE *argv, VALUE klass)
{
VALUE ary;
long len = 0, i;
for (i=0; i<argc; i++) {
argv[i] = to_ary(argv[i]);
len += RARRAY_LEN(argv[i]);
}
ary = rb_ary_new2(len);
long beg = 0;
for (i=0; i<argc; i++) {
ary_memcpy(ary, beg, RARRAY_LEN(argv[i]), RARRAY_CONST_PTR(argv[i]));
beg += RARRAY_LEN(argv[i]);
}
ARY_SET_LEN(ary, len);
return ary;
}
You have to register this method in Init_Array
via:
rb_define_singleton_method(rb_cArray, "concat", rb_ary_s_concat, -1);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With