Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ruby's digits method performance

I'm solving some Project Euler problems using Ruby, and specifically here I'm talking about problem 25 (What is the index of the first term in the Fibonacci sequence to contain 1000 digits?).

At first, I was using Ruby 2.2.3 and I coded the problem as such:

number = 3
a = 1
b = 2

while b.to_s.length < 1000
  a, b = b, a + b
  number += 1
end
puts number

But then I found out that version 2.4.2 has a method called digits which is exactly what I needed. I transformed to code to:

while b.digits.length < 1000

And when I compared the two methods, digits was much slower.

Time

./025/problem025.rb 0.13s user 0.02s system 80% cpu 0.190 total

./025/problem025.rb 2.19s user 0.03s system 97% cpu 2.275 total

Does anyone have an idea why?

like image 491
Frank Kair Avatar asked Dec 23 '17 15:12

Frank Kair


1 Answers

Ruby's digits

  • ... is implemented in rb_int_digits.
  • Which for non-tiny numbers (i.e., most of your numbers) uses rb_int_digits_bigbase.
  • Which extracts digit after digit naively with division/modulo by base.
  • So it should take quadratic time (at least with a small base such as 10).

Ruby's to_s

  • ... is implemented in int_to_s.
  • Which uses rb_int2str.
  • Which for non-tiny numbers uses rb_big2str.
  • Which uses rb_big2str1.
  • Which might use big2str_gmp if available (which sounds/looks like it uses the fast GMP library) or ...
  • ... uses big2str_generic.
  • Which uses big2str_karatsuba (sweet, I recognize that name!).
  • Which looks like it has something to do with ...
  • ... Karatsuba's algorithm, which is a fast multiplication algorithm. If you multiply two n-digit numbers the naive way you learned in school, you take n2 single-digit products. Karatsuba on the other hand only needs about n1.585, which is quite a lot better. And I didn't read into this further, but I suspect what Ruby does here is also this efficient. Eric Lippert's answer with a base conversion algorithm uses Karatsuba multiplication and says "this [base conversion] algorithm is utterly dominated by the cost of the multiplication".

Comparing quadratic to n1.585 over the number lengths from 1 digit to 1000 digits gives factor 15:

(1..1000).sum { |i| i**2 } / (1..1000).sum { |i| i**1.585 }
=> 15.150583254950678

Which is roughly the factor you observed as well. Of course that's a rather naive comparison, but, well, why not.

GMP by the way apparently uses/used a "near O(n * log(n)) FFT-based multiplication algorithm".

Thanks to @Drenmi's answer for motivating me to dig into the source after all. I hope I did this right, no guarantees, I'm a Ruby beginner. But that's why I left all the links there for you to check for yourself :-P

like image 54
Stefan Pochmann Avatar answered Sep 20 '22 00:09

Stefan Pochmann