Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to efficiently concatenate multiple arrays in Ruby?

I just wanted to concatenate multiple arrays in Ruby and couldn't find a satisfying way to do so.

Example input:

foo = [1, 2, 3]
bar = [4, 5, 6]
baz = [7, 8, 9]

Expected result: (without modifying the existing arrays)

[1, 2, 3, 4, 5, 6, 7, 8, 9]

My actual arrays are much larger, so I'm interested in an efficient solution. There may also be more than three arrays, so a short syntax is preferred.

What I have tried so far

  • foo + bar + baz is the obvious one, it's concise and clear. But it is evaluated as (foo + bar) + baz. In other words: it creates an intermediate array [1, 2, 3, 4, 5, 6] that is thrown away after the whole operation. As noted in the documentation:

    repeated use of += on arrays can be quite inefficient

  • [*foo, *bar, *baz] basically inlines the elements which is not very efficient for large arrays, either. It also looks more like a hack to me.

  • [foo, bar, baz].flatten(1) seems to be even worse than the above, performance wise.

  • [].concat(foo).concat(bar).concat(baz) is the fastest, but it looks cumbersome and it needs multiple method invocations.

Shouldn't there be a simple class method for such a basic task? Something like:

Array.concat(foo, bar, baz)

Am I missing something obvious?

like image 641
Stefan Avatar asked Nov 09 '16 15:11

Stefan


People also ask

How do you append an array into another array in Ruby?

Ruby | Array concat() operation Array#concat() : concat() is a Array class method which returns the array after appending the two arrays together.

Does concat mutate Ruby?

The OP's question, as noted in other answers, is comparing two operators that perform different purposes. One, concat , which is destructive to (mutates) the original array, and + which is non-destructive (pure functional, no mutation).

How do you concatenate strings in Ruby?

Concatenating strings or string concatenation means joining two or more strings together. In Ruby, we can use the plus + operator to do this, or we can use the double less than operator << .

What does reduce function do in Ruby?

The 'reduce' method can be used to take an array and reduce it to a single value. Lets start with a simple demonstration of this method. Lets say we want write a method to find the sum of all numbers in an array of numbers. You may be inclined to write your code in this manner: def sum(array) sum = 0.


2 Answers

If you've already determined that multiple concatenation is the fastest method, you can write it nicer using reduce:

[foo, bar, baz].reduce([], :concat)
like image 158
Max Avatar answered Nov 01 '22 10:11

Max


I've created another benchmark, comparing +, concat and a custom C extension with a variable number of arrays.

Result

  • the C extension was always fastest and roughly 2-3x faster than concat
  • plus is getting really slow if you concatenate many arrays

Conclusion

Although "2-3x" sounds like a huge improvement, it's just a few milliseconds in absolute terms. I was expecting a bigger difference by not having to resize the array, but this is apparently not a huge factor.

IMO, concat is a decent performer and I see no urgent need for a C extension.


My test arrays contain nil values. Other elements don't seem to produce different results (in relative terms).

I didn't include flat_map, because it is equivalent to concat.

Concatenating 3 arrays of size 100 (10000 times)
                 user     system      total        real
plus         0.020000   0.000000   0.020000 (  0.027927)
concat       0.020000   0.010000   0.030000 (  0.033204)
c_extension  0.010000   0.010000   0.020000 (  0.010727)

Concatenating 10 arrays of size 100 (10000 times)
                 user     system      total        real
plus         0.110000   0.070000   0.180000 (  0.180417)
concat       0.050000   0.020000   0.070000 (  0.065299)
c_extension  0.010000   0.010000   0.020000 (  0.025475)

Concatenating 10 arrays of size 1000 (10000 times)
                 user     system      total        real
plus         0.690000   0.560000   1.250000 (  1.252319)
concat       0.180000   0.130000   0.310000 (  0.303365)
c_extension  0.120000   0.120000   0.240000 (  0.248589)

plus is excluded from the following results

Concatenating 10 arrays of size 100000 (100 times)
                 user     system      total        real
concat       0.220000   0.340000   0.560000 (  0.568730)
c_extension  0.130000   0.150000   0.280000 (  0.281354)

Concatenating 100 arrays of size 10000 (100 times)
                 user     system      total        real
concat       0.210000   0.320000   0.530000 (  0.519030)
c_extension  0.160000   0.140000   0.300000 (  0.304751)

Concatenating 1000 arrays of size 1000 (100 times)
                 user     system      total        real
concat       0.240000   0.330000   0.570000 (  0.563511)
c_extension  0.150000   0.120000   0.270000 (  0.283546)

Concatenating 10000 arrays of size 100 (100 times)
                 user     system      total        real
concat       0.330000   0.310000   0.640000 (  0.643987)
c_extension  0.170000   0.120000   0.290000 (  0.286489)

Concatenating 100000 arrays of size 10 (100 times)
                 user     system      total        real
concat       1.300000   0.340000   1.640000 (  1.648687)
c_extension  0.310000   0.150000   0.460000 (  0.458214)

Test code:

require 'benchmark'

values = [
  # small
  { count: 3,      size: 100,    n: 10000 },
  { count: 10,     size: 100,    n: 10000 },
  { count: 10,     size: 1000,   n: 10000 },
  # large
  { count: 10,      size: 100000, n: 100 },
  { count: 100,     size: 10000,  n: 100 },
  { count: 1000,    size: 1000,   n: 100 },
  { count: 10000,   size: 100,    n: 100 },
  { count: 100000,  size: 10,     n: 100 }
]

values.each_with_index do |h, i|
  count, size, n = h.values_at(:count, :size, :n)
  arrays = Array.new(count) { Array.new(size) }

  puts
  puts "Concatenating #{count} arrays of size #{size} (#{n} times)"
  Benchmark.bm(10) do |r|
    r.report('plus')        { n.times { arrays.reduce(:+) } } if i < 3
    r.report('concat')      { n.times { arrays.reduce([], :concat) } }
    r.report('c_extension') { n.times { Array.concat(*arrays) } }
  end
end

C extension: (a patch actually, I've added this to Ruby's array.c)

VALUE
rb_ary_s_concat(int argc, VALUE *argv, VALUE klass)
{
  VALUE ary;
  long len = 0, i;
  for (i=0; i<argc; i++) {
    argv[i] = to_ary(argv[i]);
    len += RARRAY_LEN(argv[i]);
  }
  ary = rb_ary_new2(len);
  long beg = 0;
  for (i=0; i<argc; i++) {
    ary_memcpy(ary, beg, RARRAY_LEN(argv[i]), RARRAY_CONST_PTR(argv[i]));
    beg += RARRAY_LEN(argv[i]);
  }
  ARY_SET_LEN(ary, len);
  return ary;
}

You have to register this method in Init_Array via:

rb_define_singleton_method(rb_cArray, "concat", rb_ary_s_concat, -1);
like image 43
Stefan Avatar answered Nov 01 '22 10:11

Stefan