Intuitively, the latter should be faster than the former. However, I was very surprised when I saw benchmark results: <pre class="prettyprint"><code> require 'benchmark/ips' b = (0..20).to_a; y = 21; Benchmark.ips do |x| x.report('<<') { a = b.dup; a << y } x.report('+=') { a = b.dup; a += [y] } x.report('push') { a = b.dup; a.push(y) } x.report('[]=') { a = b.dup; a[a.size]=y } x.compare! end </code></pre> The result is: <pre class="prettyprint"><code>Calculating ------------------------------------- << 24.978k i/100ms += 30.389k i/100ms push 24.858k i/100ms []= 22.306k i/100ms ------------------------------------------------- << 493.125k (± 3.2%) i/s - 2.473M += 599.830k (± 2.3%) i/s - 3.009M push 476.374k (± 3.3%) i/s - 2.386M []= 470.263k (± 3.8%) i/s - 2.364M Comparison: +=: 599830.3 i/s <<: 493125.2 i/s - 1.22x slower push: 476374.0 i/s - 1.26x slower []=: 470262.8 i/s - 1.28x slower </code></pre> However, when a colleague of mine independently created his own benchmark, the result was quite the opposite: <pre class="prettyprint"><code> Benchmark.ips do |x| x.report('push') {@a = (0..20).to_a; @a.push(21)} x.report('<<') {@b = (0..20).to_a; @b << 21} x.report('+=') {@c = (0..20).to_a; @c += [21]} x.compare! end </code></pre> Result: <pre class="prettyprint"><code>Calculating ------------------------------------- push 17.623k i/100ms << 18.926k i/100ms += 16.079k i/100ms ------------------------------------------------- push 281.476k (± 4.2%) i/s - 1.410M << 288.341k (± 3.6%) i/s - 1.457M += 219.774k (± 8.3%) i/s - 1.093M Comparison: <<: 288341.4 i/s push: 281476.3 i/s - 1.02x slower +=: 219774.1 i/s - 1.31x slower </code></pre> We also cross-ran our benchmarks and on both our machines his benchmark showed that <code>+=</code> is noticeably slower than <code><<</code>, and mine showed the opposite. Why is that? UPD: my Ruby version is Ruby 2.2.3p173 (2015-08-18 revision 51636) [x86_64-darwin14]; my colleague's is 2.2.2 (Don't know full details, will update the post tomorrow). UPD2: ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin12.0] my teammate's Ruby version.

I believe this is down to how MRI allocates arrays (all of this answer is very MRI specific). Ruby tries really hard to be efficient with arrays: small arrays (<= 3 elements) are packed right into the RARRAY structure for example. Another thing is that if you take an array and start appending values one at a time, ruby doesn't grow the buffer one element at a time, it does it in chunks: this is more efficient, at the expense of a small amount of memory. One tool to see all this is using memsize_of: <pre class="prettyprint"><code>ObjectSpace.memspace_of([]) #=> 40 (on 64 bit platforms ObjectSpace.memspace_of([1,2]) #=> 40 (on 64 bit platforms ObjectSpace.memsize_of([1,2,3,4]) #=> 72 ObjectSpace.memsize_of([]<<1<<2<<3<<4) #=> 200 </code></pre> In the first 2 cases, the array is packed within the RARRAY structure so the memory size is just the base size of any object (40 bytes). In the third case ruby had to allocate a array for 4 values (8 bytes each) so the size is 40 + 32 = 72. In the last case ruby grew the storage to 20 elements This is relevant to the second case. The block inside the benchmark has a freshly created array which still has some spare capacity: <pre class="prettyprint"><code> ObjectSpace.memsize_of((0..20).to_a) #=> 336, enough for nearly 40 elements. </code></pre> <code><<</code> can just write its object to the appropriate slot, whereas <code>+=</code> has to allocate a new array (both the object and its buffer) and copy all the data. If I do <pre class="prettyprint"><code>a = [1,2,3,4,5] b = a.dup ObjectSpace.memsize_of(b) #=> 40 </code></pre> Here <code>b</code> shares its buffer with <code>a</code>, so is reported as using no memory beyond the basic object size. At the point where <code>b</code> gets written to, ruby will have to copy the data (copy on write): in the first benchmark BOTH <code>+=</code> and <code><<</code> are actually allocating a new buffer of sufficient size and copying all the data across. Here is where I get handwavy: this would completely explain things if <code><<</code> and <code>+=</code> performed identically, but that's not what is happening. My reading of things is that <code>+</code> is simpler. All it has to do, no matter what is allocate a buffer, and memcpy some data from 2 locations - this is fast. <code><<</code> on the other hand is mutating the array so it is paying the overhead of copy on write: it is doing extra work compare to <code>+=</code>. For example ruby needs to track who is sharing buffers so that garbage collection of the original array is possible when no one is sharing it anymore. A benchmark that goes someway to convincing me that this interpretation is correct is as follows: <pre class="prettyprint"><code>require 'benchmark/ips' b = (0..20).to_a.dup y = 21 Benchmark.ips do |x| x.report('<<') { a = b.dup; a << y } x.report('+=') { a = b.dup; a += [y] } x.report('<<2') { a = b.dup; a << y; a<< y} x.report('+=2') { a = b.dup; a += [y]; a += [y] } end </code></pre> This is basically the same benchmark as the original, but now appending 2 elements. For <code><<</code> the copy on write overhead will only be incurred the first time. The results I get are <pre class="prettyprint"><code> << 1.325M (± 7.6%) i/s - 6.639M += 1.742M (± 9.5%) i/s - 8.677M <<2 1.230M (±10.3%) i/s - 6.079M +=2 1.140M (±10.8%) i/s - 5.656M </code></pre> So appending to the array is back on top if you do it twice.

What is faster in Ruby, `arr += [x]` or `arr << x`

Q: Is map faster than each Ruby?

Array#map also executes the given block for each element of the array, but returns a new array whose values are the return values of each iteration of the block. Points to Remember: each loop is slightly fast as compare to map.

Q: What is .first in Ruby?

Ruby | Array class first() function first() is a Array class method which returns the first element of the array or the first 'n' elements from the array.

Q: How do you combine arrays in Ruby?

This can be done in a few ways in Ruby. The first is the plus operator. This will append one array to the end of another, creating a third array with the elements of both. Alternatively, use the concat method (the + operator and concat method are functionally equivalent).

Q: Which method is used to enumerate over an array returning an array matching the result of the executed block for each element in Ruby?

The map method iterates over an array applying a block to each element of the array and returns a new array with those results.

Tags:

performance

arrays

ruby

benchmarking

microbenchmark

Intuitively, the latter should be faster than the former. However, I was very surprised when I saw benchmark results:

  require 'benchmark/ips'

  b = (0..20).to_a;
  y = 21;
  Benchmark.ips do |x|
    x.report('<<')   { a = b.dup; a << y }
    x.report('+=')   { a = b.dup; a += [y] }
    x.report('push') { a = b.dup; a.push(y) }
    x.report('[]=')  { a = b.dup; a[a.size]=y }
    x.compare!
  end

The result is:

Calculating -------------------------------------
                  <<    24.978k i/100ms
                  +=    30.389k i/100ms
                push    24.858k i/100ms
                 []=    22.306k i/100ms
-------------------------------------------------
                  <<    493.125k (± 3.2%) i/s -      2.473M
                  +=    599.830k (± 2.3%) i/s -      3.009M
                push    476.374k (± 3.3%) i/s -      2.386M
                 []=    470.263k (± 3.8%) i/s -      2.364M

Comparison:
                  +=:   599830.3 i/s
                  <<:   493125.2 i/s - 1.22x slower
                push:   476374.0 i/s - 1.26x slower
                 []=:   470262.8 i/s - 1.28x slower

However, when a colleague of mine independently created his own benchmark, the result was quite the opposite:

 Benchmark.ips do |x|
   x.report('push') {@a = (0..20).to_a; @a.push(21)}
   x.report('<<')   {@b = (0..20).to_a; @b << 21}
   x.report('+=')   {@c = (0..20).to_a; @c += [21]}
   x.compare!
 end

Result:

Calculating -------------------------------------
                push    17.623k i/100ms
                  <<    18.926k i/100ms
                  +=    16.079k i/100ms
-------------------------------------------------
                push    281.476k (± 4.2%) i/s -      1.410M
                  <<    288.341k (± 3.6%) i/s -      1.457M
                  +=    219.774k (± 8.3%) i/s -      1.093M

Comparison:
                  <<:   288341.4 i/s
                push:   281476.3 i/s - 1.02x slower
                  +=:   219774.1 i/s - 1.31x slower

We also cross-ran our benchmarks and on both our machines his benchmark showed that += is noticeably slower than <<, and mine showed the opposite.

Why is that?

UPD: my Ruby version is Ruby 2.2.3p173 (2015-08-18 revision 51636) [x86_64-darwin14]; my colleague's is 2.2.2 (Don't know full details, will update the post tomorrow).

UPD2: ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin12.0] my teammate's Ruby version.

564

asked Dec 10 '15 16:12

DNNX

2 Answers

In my view, to simplify the comparison of various operators, we should remove the unnecessary code and keep the test simple.

require 'benchmark/ips'

y = 10
Benchmark.ips do |x|
    x.report('<<')   { a = [0,1,2,3,4,5,6,7,8,9]; a << y }
    x.report('+=')   { a = [0,1,2,3,4,5,6,7,8,9]; a += [y] }
    x.report('push') { a = [0,1,2,3,4,5,6,7,8,9]; a.push(y) }
    x.report('[]=')  { a = [0,1,2,3,4,5,6,7,8,9]; a[a.size]=y }
    x.compare!
end

The result of the above code is in line with the second code snippet shared in the question.

Calculating -------------------------------------
                  <<   101.735k i/100ms
                  +=   104.804k i/100ms
                push    92.863k i/100ms
                 []=    99.604k i/100ms
-------------------------------------------------
                  <<      2.134M (± 3.3%) i/s -     10.682M
                  +=      1.786M (±13.2%) i/s -      8.804M
                push      1.930M (±16.1%) i/s -      9.472M
                 []=      1.948M (± 7.9%) i/s -      9.761M

Comparison:
                  <<:  2134005.4 i/s
                 []=:  1948256.8 i/s - 1.10x slower
                push:  1930165.3 i/s - 1.11x slower
                  +=:  1785808.5 i/s - 1.19x slower

[Finished in 28.3s]

Why << is faster than +=?

Array#<< is fastest out of the four ways of appending an element to the array because it does just that - appends an element to the array. On the contrary, Array#+ appends an element but returns a new copy of the array - creation of new copy of array makes it slowest. (One can use toogle code option in documentation to understand additional work done by some methods)

Bench marking with dup

If we use below code for bench marking,

require 'benchmark/ips'

y = 10
Benchmark.ips do |x|
    x.report('<<')   { a = [0,1,2,3,4,5,6,7,8,9].dup; a << y }
    x.report('+=')   { a = [0,1,2,3,4,5,6,7,8,9].dup; a += [y] }
    x.report('push') { a = [0,1,2,3,4,5,6,7,8,9].dup; a.push(y) }
    x.report('[]=')  { a = [0,1,2,3,4,5,6,7,8,9].dup; a[a.size]=y }
    x.compare!
end

We see below results:

Calculating -------------------------------------
                  <<    65.225k i/100ms
                  +=    76.106k i/100ms
                push    64.864k i/100ms
                 []=    63.582k i/100ms
-------------------------------------------------
                  <<      1.221M (±14.3%) i/s -      6.001M
                  +=      1.291M (±13.1%) i/s -      6.393M
                push      1.164M (±14.1%) i/s -      5.773M
                 []=      1.168M (±14.5%) i/s -      5.722M

Comparison:
                  +=:  1290970.6 i/s
                  <<:  1221029.0 i/s - 1.06x slower
                 []=:  1168219.3 i/s - 1.11x slower
                push:  1163965.9 i/s - 1.11x slower

[Finished in 28.3s]

If we look carefully between two results, we see only one difference. The += entry has become first, whereas the order of rest of the methods remained same with the original result.

Why results flip when dup is used?

This is my wild guess, I am guessing that Ruby interpreter optimized the code and did not create a new array as part of += as it knew that it is working on freshly created copy of array by dup

145

answered Oct 07 '22 01:10

7 revs

I believe this is down to how MRI allocates arrays (all of this answer is very MRI specific). Ruby tries really hard to be efficient with arrays: small arrays (<= 3 elements) are packed right into the RARRAY structure for example.

Another thing is that if you take an array and start appending values one at a time, ruby doesn't grow the buffer one element at a time, it does it in chunks: this is more efficient, at the expense of a small amount of memory.

One tool to see all this is using memsize_of:

ObjectSpace.memspace_of([]) #=> 40 (on 64 bit platforms
ObjectSpace.memspace_of([1,2]) #=> 40 (on 64 bit platforms
ObjectSpace.memsize_of([1,2,3,4]) #=> 72
ObjectSpace.memsize_of([]<<1<<2<<3<<4) #=> 200

In the first 2 cases, the array is packed within the RARRAY structure so the memory size is just the base size of any object (40 bytes). In the third case ruby had to allocate a array for 4 values (8 bytes each) so the size is 40 + 32 = 72. In the last case ruby grew the storage to 20 elements

This is relevant to the second case. The block inside the benchmark has a freshly created array which still has some spare capacity:

 ObjectSpace.memsize_of((0..20).to_a) #=> 336, enough for nearly 40 elements.

<< can just write its object to the appropriate slot, whereas += has to allocate a new array (both the object and its buffer) and copy all the data.

If I do

a = [1,2,3,4,5]
b  = a.dup
ObjectSpace.memsize_of(b) #=> 40

Here b shares its buffer with a, so is reported as using no memory beyond the basic object size. At the point where b gets written to, ruby will have to copy the data (copy on write): in the first benchmark BOTH += and << are actually allocating a new buffer of sufficient size and copying all the data across.

Here is where I get handwavy: this would completely explain things if << and += performed identically, but that's not what is happening. My reading of things is that + is simpler. All it has to do, no matter what is allocate a buffer, and memcpy some data from 2 locations - this is fast.

<< on the other hand is mutating the array so it is paying the overhead of copy on write: it is doing extra work compare to +=. For example ruby needs to track who is sharing buffers so that garbage collection of the original array is possible when no one is sharing it anymore.

A benchmark that goes someway to convincing me that this interpretation is correct is as follows:

require 'benchmark/ips'
b = (0..20).to_a.dup
y = 21
Benchmark.ips do |x|
  x.report('<<')   { a = b.dup; a << y }
  x.report('+=')   { a = b.dup; a += [y] }

  x.report('<<2')   { a = b.dup; a << y; a<< y}
  x.report('+=2')   { a = b.dup; a += [y]; a += [y] }
end

This is basically the same benchmark as the original, but now appending 2 elements. For << the copy on write overhead will only be incurred the first time. The results I get are

              <<      1.325M (± 7.6%) i/s -      6.639M
              +=      1.742M (± 9.5%) i/s -      8.677M
             <<2      1.230M (±10.3%) i/s -      6.079M
             +=2      1.140M (±10.8%) i/s -      5.656M

So appending to the array is back on top if you do it twice.

answered Oct 07 '22 02:10

Frederick Cheung

Related questions
                            
                                Can a std::array alias a fragment of a larger array?
                            
                                T* versus char* pointer arithmetic
                            
                                React - Changing the state without using setState: Must avoid it?
                            
                                Convert array to tuple?
                            
                                Is there a way for countof() to test if its argument is an array?
                            
                                Will results of numpy.as_strided depend on input dtype?
                            
                                How do I use groovy jsonbuilder with .each to create an array?
                            
                                Numba slow when assigning to an array?
                            
                                Move duplicates to the end of a sorted array
                            
                                React - map array to child component
                            
                                How to deserialize null array to null in c#?
                            
                                C# Array of references
                            
                                Sorting array from typedef struct in C
                            
                                Javascript array becomes an object structure
                            
                                Shrink int array C
                            
                                create a hole in an array
                            
                                Comparing two arrays which contain objects
                            
                                Pb converting a list of pandas.Series into a numpy array of pandas.Series
                            
                                Cannot convert Brace-enclosed initializer list
                            
                                Pass array by reference and modify values C++

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With