 

ruby: optimize => phrase.split(delimiter).collect {|p| p.lstrip.rstrip }

Tags:

regex

ruby

ruby: What is the most optimized expression that evaluates to the same result as

phrase.split(delimiter).collect {|p| p.lstrip.rstrip }
Ram on Rails React Native asked Dec 23 '22 05:12


2 Answers

Optimised for clarity I would prefer the following:

phrase.split(delimiter).collect(&:strip)

But I presume you want to optimise for speed. I don't know why others are speculating. The only way to find out what is faster is to benchmark your code.

Make sure you adjust the benchmark parameters - this is just an example.

require "benchmark"

# Adjust parameters below for your typical use case.
n = 10_000
input = " This is - an example. - A relatively long string " +
  "- delimited by dashes. - Adjust if necessary " * 100
delimiter = "-"

Benchmark.bmbm do |bench|
  bench.report "collect { |s| s.lstrip.rstrip }" do
    # Your example.
    n.times { input.split(delimiter).collect { |s| s.lstrip.rstrip } }
  end

  bench.report "collect { |s| s.strip }" do
    # Use .strip instead of .lstrip.rstrip.
    n.times { input.split(delimiter).collect { |s| s.strip } }
  end

  bench.report "collect { |s| s.strip! }" do
    # Use .strip! to modify strings in-place. Note that strip! returns nil
    # when nothing was stripped, so the collected array may contain nils.
    n.times { input.split(delimiter).collect { |s| s.strip! } }
  end

  bench.report "collect(&:strip!)" do
    # Slow block creation (&:strip! syntax).
    n.times { input.split(delimiter).collect(&:strip!) }
  end

  bench.report "split(/\\s*\#{delim}\\s*/) (static)" do
    # Use static regex -- only possible if delimiter doesn't change.
    # "\\s" is needed here: inside a double-quoted string, "\s" is a literal space.
    re = Regexp.new("\\s*#{Regexp.escape(delimiter)}\\s*")
    n.times { input.split(re) }
  end

  bench.report "split(/\\s*\#{delim}\\s*/) (dynamic)" do
    # Use dynamic regex, slower to create every time?
    n.times { input.split(Regexp.new("\\s*#{Regexp.escape(delimiter)}\\s*")) }
  end
end

Results on my laptop with the parameters listed above:

                                      user     system      total        real
collect { |s| s.lstrip.rstrip }   7.970000   0.050000   8.020000 (  8.246598)
collect { |s| s.strip }           6.350000   0.050000   6.400000 (  6.837892)
collect { |s| s.strip! }          5.110000   0.020000   5.130000 (  5.148050)
collect(&:strip!)                 5.700000   0.030000   5.730000 (  6.010845)
split(/\s*#{delim}\s*/) (static)  6.890000   0.030000   6.920000 (  7.071058)
split(/\s*#{delim}\s*/) (dynamic) 6.900000   0.020000   6.920000 (  6.983142)

From the above I might conclude:

  • Using .strip instead of .lstrip.rstrip is faster.
  • Preferring &:strip! over { |s| s.strip! } comes with a performance cost.
  • Simple regex patterns are nearly as fast as using split + strip.
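
One caveat before picking the fastest row: strip! is not a drop-in replacement inside collect, because it returns nil when the string had no surrounding whitespace. A minimal sketch of the difference, with a made-up phrase chosen so that one chunk is already clean:

```ruby
# Hypothetical input: the first chunk ("a") has no whitespace to strip.
phrase = "a- b - c"
delimiter = "-"

# strip always returns a (new) stripped string.
with_strip = phrase.split(delimiter).collect { |s| s.strip }

# strip! returns nil for chunks that were already clean.
with_bang = phrase.split(delimiter).collect { |s| s.strip! }

# Mutating in place with each keeps the array of (now stripped) strings.
in_place = phrase.split(delimiter).each(&:strip!)

p with_strip # ["a", "b", "c"]
p with_bang  # [nil, "b", "c"]
p in_place   # ["a", "b", "c"]
```

If the in-place variant is the one you want, each(&:strip!) gives the expected array, but it is a different expression from the ones timed above, so it would need its own benchmark entry.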

Things that I can think of that may influence the result:

  • The length of the delimiter (and whether or not it is whitespace).
  • The length of the strings that you want to split.
  • The length of the splittable chunks in the string.
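
To see how such factors play out, you could sweep one of them and time each setting. The following is only a rough sketch: the chunk text, the chunk counts and the iteration count are arbitrary values I picked for illustration, not measurements from the benchmark above.

```ruby
require "benchmark"

delimiter = "-"
chunk = " some text "   # arbitrary sample chunk, padded with whitespace
iterations = 200        # kept small so the sweep finishes quickly

# Time split + strip for inputs with a growing number of chunks.
timings = [10, 100, 1_000].map do |chunks|
  input = ([chunk] * chunks).join(delimiter)
  seconds = Benchmark.realtime do
    iterations.times { input.split(delimiter).collect { |s| s.strip } }
  end
  [chunks, seconds]
end

timings.each do |chunks, seconds|
  puts format("%5d chunks: %.4fs", chunks, seconds)
end
```

The same loop can vary the chunk length or the delimiter instead of the chunk count.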

But don't take my word for it. Measure it!

molf answered Mar 15 '23 18:03


You could try a regular expression:

phrase.strip.split(/\s*#{delimiter}\s*/)
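
If the delimiter can contain regex metacharacters (".", "|", "*" and so on), it probably should be escaped first. A small sketch, assuming a helper name (split_and_strip) and sample inputs that I made up for illustration:

```ruby
# Hypothetical helper: split on the delimiter and swallow surrounding whitespace.
def split_and_strip(phrase, delimiter)
  # Regexp.escape makes the delimiter match literally.
  pattern = /\s*#{Regexp.escape(delimiter)}\s*/
  phrase.strip.split(pattern)
end

phrase = "  a . b.c  "

escaped   = split_and_strip(phrase, ".")
reference = phrase.split(".").collect(&:strip)

# Without escaping, "." matches any character, so the result is not the chunks.
unescaped = phrase.strip.split(/\s*.\s*/)

p escaped   # ["a", "b", "c"]
p reference # ["a", "b", "c"]

# The two approaches are not identical for whitespace-only trailing chunks:
# split drops trailing empty fields, while split + strip keeps the "" chunk.
p split_and_strip("a - ", "-")          # ["a"]
p "a - ".split("-").collect(&:strip)    # ["a", ""]
```

So the regex form matches the original expression for typical input, but it is worth checking against your own edge cases before swapping it in.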
Mark Byers answered Mar 15 '23 19:03