
Why is the shovel operator (<<) preferred over plus-equals (+=) when building a string in Ruby?

People also ask

What does += mean in Ruby?

<< and + are methods (in Ruby, santa << ' Nick' is the same as santa.<<(' Nick')), while += is shorthand that combines the + method with an assignment.

What is the shovel operator Ruby?

Commonly referred to as the “shovel operator”, << is a method in Ruby that is typically used to push an object onto an array, but you can shovel into strings as well.


Proof:

a = 'foo'
a.object_id #=> 2154889340
a << 'bar'
a.object_id #=> 2154889340
a += 'quux'
a.object_id #=> 2154742560

So << alters the original string rather than creating a new one. The reason is that in Ruby a += b is syntactic shorthand for a = a + b (the same goes for the other <op>= operators): + builds a new string, and the assignment then rebinds a to it. << on the other hand, like concat(), alters the receiver in place.
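To make the difference between rebinding and mutating concrete, here is a minimal sketch. The helper method names are made up for illustration; only String#+, String#<<, #equal? and #freeze are Ruby's own. The last helper assumes a Ruby version where modifying a frozen string raises a RuntimeError (FrozenError from 2.5 on).

```ruby
# Illustrative helper names; not part of any library.

def plus_equals_rebinds?
  a = String.new("foo")
  original = a                 # second reference to the same object
  a += "bar"                   # expands to: a = a + "bar"
  # a now names a different object; the original string is untouched
  !a.equal?(original) && original == "foo" && a == "foobar"
end

def shovel_mutates_receiver?
  a = String.new("foo")
  original = a
  result = (a << "bar")        # << returns the receiver itself
  result.equal?(original) && original == "foobar"
end

def shovel_on_frozen_raises?
  frozen = "foo".freeze
  frozen << "bar"              # in-place change is not allowed on a frozen string
  false
rescue RuntimeError            # FrozenError (Ruby 2.5+) is a RuntimeError subclass
  true
end

puts plus_equals_rebinds?
puts shovel_mutates_receiver?
puts shovel_on_frozen_raises?
```

All three calls print true. The frozen case is a side effect worth knowing: += still works on a frozen string because it never touches the original object.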


Performance proof:

#!/usr/bin/env ruby

require 'benchmark'

Benchmark.bmbm do |x|
  x.report('+= :') do
    s = ""
    10000.times { s += "something " }
  end
  x.report('<< :') do
    s = ""
    10000.times { s << "something " }
  end
end

# Rehearsal ----------------------------------------
# += :   0.450000   0.010000   0.460000 (  0.465936)
# << :   0.010000   0.000000   0.010000 (  0.009451)
# ------------------------------- total: 0.470000sec
# 
#            user     system      total        real
# += :   0.270000   0.010000   0.280000 (  0.277945)
# << :   0.000000   0.000000   0.000000 (  0.003043)

A friend who is learning Ruby as his first programming language asked me this same question while going through the Strings section of the Ruby Koans. I explained it to him using the following analogy:

You have a glass of water that is half full and you need to refill your glass.

The first way, you take a new glass, fill it halfway with water from the tap, and then use this second half-full glass to refill your drinking glass. You do this every time you need a refill.

The second way, you take your half-full glass and just refill it straight from the tap.

At the end of the day, you have more glasses to clean if you pick up a new glass every time you need a refill.

The same applies to the shovel operator and the plus-equals operator. Plus-equals picks up a new 'glass' every time it needs a refill, while the shovel operator takes the same glass and refills it. At the end of the day there are more 'glasses' to collect for the plus-equals operator.
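The analogy can be checked by counting the 'glasses'. This is only a sketch: glasses_used is a made-up helper that applies a block repeatedly and counts how many distinct string objects it saw. Every result is kept in an array so that object identities stay comparable for the whole run.

```ruby
# Hypothetical helper: refills a string `refills` times using the given block
# and returns how many distinct String objects were involved.
def glasses_used(refills)
  seen = []                    # hold on to every result so none is collected mid-run
  s = String.new
  refills.times do
    s = yield(s)               # the block returns the "refilled" string
    seen << s
  end
  seen.map(&:object_id).uniq.size
end

# s + "water " is the right-hand side that s += "water " expands to
plus_glasses   = glasses_used(100) { |s| s + "water " }
shovel_glasses = glasses_used(100) { |s| s << "water " }

puts "+= used #{plus_glasses} glasses"    # 100: a new string per refill
puts "<< used #{shovel_glasses} glass"    # 1: the same string every time
```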


This is an old question, but I just ran across it and I'm not fully satisfied with the existing answers. There are lots of good points about the shovel << being faster than concatenation +=, but there is also a semantic consideration.

The accepted answer from @noodl shows that << modifies the existing object in place, whereas += creates a new object. So you need to consider whether you want all references to the string to reflect the new value, or whether you want to leave the existing references alone and create a new string value to use locally. If you need all references to reflect the updated value, use <<. If you want to leave other references alone, use +=.

A very common case is that there is only a single reference to the string. In this case, the semantic difference does not matter and it is natural to prefer << because of its speed.
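One place where the semantic difference tends to show up is method arguments, since the method and its caller hold references to the same string. A small sketch, with made-up method names:

```ruby
# Hypothetical methods for illustration only.

def exclaim_in_place(text)
  text << "!"        # mutates the object the caller passed in
end

def exclaim_copy(text)
  text += "!"        # rebinds the local variable; the caller's string is untouched
  text
end

name = String.new("Ada")

shouted = exclaim_copy(name)
puts name            # Ada
puts shouted         # Ada!

exclaim_in_place(name)
puts name            # Ada!
```

Which behavior is right depends on whether the caller expects its string to change, so it is worth being deliberate about the choice rather than picking << for speed alone.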


Because it's faster: << does not create a copy of the string, so it leaves no discarded intermediate strings for the garbage collector to clean up.
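A rough way to see the garbage being produced is to compare object allocation counts. This sketch assumes a Ruby version where GC.stat accepts the :total_allocated_objects key (2.2 or later); allocations is a made-up helper, and the appended piece is a frozen constant so that the string literal itself does not add to the count. Exact numbers will vary between Ruby versions.

```ruby
PIECE = "something ".freeze    # reused, so the loop body allocates no literal

# Hypothetical helper: number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

def plus_allocations(n)
  allocations do
    s = String.new
    n.times { s += PIECE }     # every iteration builds a new String
  end
end

def shovel_allocations(n)
  allocations do
    s = String.new
    n.times { s << PIECE }     # appends to the same String
  end
end

puts "+= allocated #{plus_allocations(10_000)} objects"
puts "<< allocated #{shovel_allocations(10_000)} objects"
```

The += count should be at least one object per iteration, while the << count should stay close to the single initial string.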


While the majority of answers cover the fact that += is slower because it creates a new copy, it's important to keep in mind that += and << are not interchangeable! You want to use each in different cases.

Using << on b will also alter any other variable that points to the same string as b. Here we also mutate a when we may not want to.

2.3.1 :001 > a = "hello"
 => "hello"
2.3.1 :002 > b = a
 => "hello"
2.3.1 :003 > b << " world"
 => "hello world"
2.3.1 :004 > a
 => "hello world"

Because += makes a new copy and reassigns only b, any other variables that point to the original string are left unchanged.

2.3.1 :001 > a = "hello"
 => "hello"
2.3.1 :002 > b = a
 => "hello"
2.3.1 :003 > b += " world"
 => "hello world"
2.3.1 :004 > a
 => "hello"

Understanding this distinction can save you a lot of headaches when you're dealing with loops!
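The answer does not say which loop problems it has in mind, so the following is just one plausible illustration: filling a collection with a shared default string and then appending to its elements. The method names are made up; Array.new with a default object and Array.new with a block are standard Ruby.

```ruby
# Hypothetical helpers for illustration.

def shared_rows(count)
  Array.new(count, String.new)       # every slot references the SAME string
end

def separate_rows(count)
  Array.new(count) { String.new }    # the block runs per slot: one string each
end

shared = shared_rows(3)
shared[0] << "x"                     # mutates the one shared string
p shared                             # ["x", "x", "x"]

rebound = shared_rows(3)
rebound[0] += "x"                    # rebinds slot 0 only
p rebound                            # ["x", "", ""]

separate = separate_rows(3)
separate[0] << "x"                   # safe: slot 0 has its own string
p separate                           # ["x", "", ""]
```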