Can somebody explain why the following trivial code (an implementation of Euclid's algorithm to find the greatest common divisor) is about 3 times slower than the equivalent code in Ruby?
contents of iter_gcd.py:
from sys import argv,stderr

def gcd(m, n):
    if n > m:
        m, n = n, m
    while n != 0:
        rem = m % n
        m = n
        n = rem
    return m

# in the Python 3 version, xrange is replaced with the range function
def main(a1, a2):
    comp = 0
    for j in xrange(a1, 1, -1):
        for i in xrange(1, a2):
            comp += gcd(i,j)
    print(comp)

if __name__ == '__main__':
    if len(argv) != 3:
        stderr.write('usage: {0:s} num1 num2\n'.format(argv[0]))
        exit(1)
    else:
        main(int(argv[1]), int(argv[2]))
contents of iter_gcd.rb:
def gcd(m, n)
  while n != 0
    rem = m % n
    m = n
    n = rem
  end
  return m
end

def main(a1, a2)
  comp = 0
  a1.downto 2 do |j|
    1.upto(a2 - 1) do |i|
      comp += gcd(i, j)
    end
  end
  puts comp
end

if __FILE__ == $0
  if ARGV.length != 2
    $stderr.puts('usage: %s num1 num2' % $0)
    exit(1)
  else
    main(ARGV[0].to_i, ARGV[1].to_i)
  end
end
Execution time measurements:
$ time python iter_gcd.py 4000 3000
61356305
real 0m22.890s
user 0m22.867s
sys 0m0.006s
$ python -V
Python 2.6.4
$ time python3 iter_gcd.py 4000 3000
61356305
real 0m18.634s
user 0m18.615s
sys 0m0.009s
$ python3 -V
Python 3.1.2
$ time ruby iter_gcd.rb 4000 3000
61356305
real 0m7.619s
user 0m7.616s
sys 0m0.003s
$ ruby -v
ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-linux]
Just curious why I got such results. I considered CPython to be faster in most cases than MRI and even the new Ruby 1.9 on YARV, but this "microbenchmark" really surprised me.
Btw, I know I can use a specialised library function like fractions.gcd, but I'd like to compare implementations of such basic and trivial language constructs.
Did I miss something, or is the implementation of the next Ruby generation really so much improved in the area of sheer speed?
In summary: the code is slowed down by the compilation and interpretation that occurs during runtime. Compare this to a statically typed, compiled language, which runs just the CPU instructions once compiled. Interpreted code is always slower than direct machine code, because it takes many more instructions to implement an interpreted instruction than an actual machine instruction; a tight loop compiled as optimized C can run on the order of 450 million iterations per second, roughly 45 times faster than the Python equivalent. The flip side is longer development time, which converts directly into extra costs, fewer features, and slower time to market. It is, however, possible to extend Python with compiled modules written in C.
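The effect of moving work into compiled code is easy to check with the standard library: math.gcd is implemented in C, though it requires Python 3.5+, newer than the interpreters benchmarked here. A minimal timing sketch:

# gcd_timing.py -- sketch: pure-Python gcd vs the C-implemented math.gcd
# (requires Python 3.5+ for math.gcd)
import timeit

setup_pure = '''
def gcd(m, n):
    while n != 0:
        m, n = n, m % n
    return m
'''
n = 1000000
t_pure = timeit.Timer('gcd(4000, 3000)', setup=setup_pure).timeit(number=n)
t_c = timeit.Timer('gcd(4000, 3000)', setup='from math import gcd').timeit(number=n)
print('pure Python gcd: %.3f s' % t_pure)
print('math.gcd (C):    %.3f s' % t_c)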
"Because the function call overhead in Python is much larger than in Ruby."
Being a microbenchmark, this really doesn't say much about the performance of either language in proper use. Likely you would want to rewrite the program to take advantage of the strengths of Python and Ruby, but this does illustrate one of the weak points of Python at the moment: the root cause of the speed difference is function call overhead. I made a few tests to illustrate; see below for the code and more details. For the Python tests, I used 2000 for both gcd parameters.
| Interpreter  | Program type                                    | Total CPU time |
|--------------|-------------------------------------------------|----------------|
| Python 2.6.6 | gcd using function call                         | 29.336 s       |
| Python 2.6.6 | gcd using inline code                           | 13.194 s       |
| Python 2.6.6 | gcd using inline code, with dummy function call | 30.672 s       |
This tells us that it's not the calculation made by the gcd function that contributes most to the time difference; it's the function call itself. With Python 3.1, the difference is similar:
| Interpreter     | Program type                                    | Total CPU time |
|-----------------|-------------------------------------------------|----------------|
| Python 3.1.3rc1 | gcd using function call                         | 30.920 s       |
| Python 3.1.3rc1 | gcd using inline code                           | 15.185 s       |
| Python 3.1.3rc1 | gcd using inline code, with dummy function call | 33.739 s       |
Again, the actual calculation is not the biggest contributor; it's the function call itself. In Ruby, the function call overhead is much smaller. (Note: I had to use smaller parameters (200) for the Ruby versions of the programs because the Ruby profiler really slows down real-time performance. That doesn't affect CPU time performance, though.)
| Interpreter                                           | Program type            | Total CPU time |
|-------------------------------------------------------|-------------------------|----------------|
| ruby 1.9.2p0 (2010-08-18 revision 29036) [i486-linux] | gcd using function call | 21.66 s        |
| ruby 1.9.2p0 (2010-08-18 revision 29036) [i486-linux] | gcd using inline code   | 21.31 s        |
| ruby 1.8.7 (2010-08-16 patchlevel 302) [i486-linux]   | gcd using function call | 27.00 s        |
| ruby 1.8.7 (2010-08-16 patchlevel 302) [i486-linux]   | gcd using inline code   | 24.83 s        |
Notice how neither Ruby 1.8 nor 1.9 suffers greatly from the gcd function call: the function call and inline versions are more or less equal. Ruby 1.9 seems to be a little better, with less difference between the function call and inline versions.
So the answer to the question is: "because the function call overhead in Python is much larger than in Ruby".
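As a sanity check on the call overhead itself, here is a minimal sketch (not one of the profiled tests above) using the standard timeit module; it times an inline statement against the same work wrapped in a function:

# call_overhead.py -- sketch: cost of a bare function call vs. inline code
# (works on both Python 2 and 3)
import timeit

n = 10000000
t_inline = timeit.Timer('a = x + y', setup='x, y = 1, 2').timeit(number=n)
t_call = timeit.Timer('f(x, y)',
                      setup='def f(a, b): c = a + b\nx, y = 1, 2').timeit(number=n)
print('inline statement: %.3f s' % t_inline)
print('function call:    %.3f s' % t_call)
print('per-call overhead: ~%.0f ns' % ((t_call - t_inline) / n * 1e9))

The complete test programs follow.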
# iter_gcd -- Python 2.x version, with gcd function call
# Python 3.x version uses range instead of xrange
from sys import argv,stderr

def gcd(m, n):
    if n > m:
        m, n = n, m
    while n != 0:
        rem = m % n
        m = n
        n = rem
    return m

def main(a1, a2):
    comp = 0
    for j in xrange(a1, 1, -1):
        for i in xrange(1, a2):
            comp += gcd(i,j)
    print(comp)

if __name__ == '__main__':
    if len(argv) != 3:
        stderr.write('usage: {0:s} num1 num2\n'.format(argv[0]))
        exit(1)
    else:
        main(int(argv[1]), int(argv[2]))
# iter_gcd -- Python 2.x version, inline calculation
# Python 3.x version uses range instead of xrange
from sys import argv,stderr

def main(a1, a2):
    comp = 0
    for j in xrange(a1, 1, -1):
        for i in xrange(1, a2):
            if i < j:
                m, n = j, i
            else:
                m, n = i, j
            while n != 0:
                rem = m % n
                m = n
                n = rem
            comp += m
    print(comp)

if __name__ == '__main__':
    if len(argv) != 3:
        stderr.write('usage: {0:s} num1 num2\n'.format(argv[0]))
        exit(1)
    else:
        main(int(argv[1]), int(argv[2]))
# iter_gcd -- Python 2.x version, inline calculation, dummy function call
# Python 3.x version uses range instead of xrange
from sys import argv,stderr

def dummyfunc(n, m):
    a = n + m

def main(a1, a2):
    comp = 0
    for j in xrange(a1, 1, -1):
        for i in xrange(1, a2):
            if i < j:
                m, n = j, i
            else:
                m, n = i, j
            while n != 0:
                rem = m % n
                m = n
                n = rem
            comp += m
            dummyfunc(i, j)
    print(comp)

if __name__ == '__main__':
    if len(argv) != 3:
        stderr.write('usage: {0:s} num1 num2\n'.format(argv[0]))
        exit(1)
    else:
        main(int(argv[1]), int(argv[2]))
# iter_gcd -- Ruby version, with gcd function call
def gcd(m, n)
  if n > m
    m, n = n, m
  end
  while n != 0
    rem = m % n
    m = n
    n = rem
  end
  return m
end

def main(a1, a2)
  comp = 0
  a1.downto 2 do |j|
    1.upto(a2 - 1) do |i|
      comp += gcd(i, j)
    end
  end
  puts comp
end

if __FILE__ == $0
  if ARGV.length != 2
    $stderr.puts('usage: %s num1 num2' % $0)
    exit(1)
  else
    main(ARGV[0].to_i, ARGV[1].to_i)
  end
end
# iter_gcd -- Ruby version, with inline gcd
def main(a1, a2)
  comp = 0
  a1.downto 2 do |j|
    1.upto(a2 - 1) do |i|
      m, n = i, j
      if n > m
        m, n = n, m
      end
      while n != 0
        rem = m % n
        m = n
        n = rem
      end
      comp += m
    end
  end
  puts comp
end

if __FILE__ == $0
  if ARGV.length != 2
    $stderr.puts('usage: %s num1 num2' % $0)
    exit(1)
  else
    main(ARGV[0].to_i, ARGV[1].to_i)
  end
end
Finally, the commands used to run Python and Ruby under the profiler to get the numbers for comparison were

pythonX.X -m cProfile iter_gcdX.py 2000 2000

for Python and

rubyX.X -rprofile iter_gcdX.rb 200 200

for Ruby. The reason for the smaller Ruby parameters is that the Ruby profiler adds a lot of overhead. The results are still valid because I'm comparing the difference between a function call and inline code, not the difference between Python and Ruby as such.
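The same profiling can also be triggered from inside a script instead of via the command-line switch; a minimal sketch for the Python side, assuming the function-call version above is saved as iter_gcd.py:

# profile_main.py -- sketch: invoking cProfile programmatically
import cProfile
from iter_gcd import main   # the __name__ guard keeps the import side-effect free

cProfile.run('main(2000, 2000)', sort='cumulative')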
Related links:
- Why is python slower compared to Ruby even with this very simple "test"?
- Is there something wrong with this python code, why does it run so slow compared to ruby?
- The Computer Language Benchmarks Game
- Google Search: ruby python function call faster
I can confirm that ruby1.9 is faster than CPython for this "microbenchmark" on my machine:
| Interpreter | Time, s | Ratio |
|---------------------------------+---------+-------|
| python-2.6 (cython_gcd.gcd_int) | 2.8 | 0.33 |
| pypy-1.4 | 3.5 | 0.41 |
| jython-2.5 (java "1.6.0_20") | 4.7 | 0.55 |
| python-2.6 (cython_gcd.gcd) | 5.6 | 0.65 |
| ruby-1.9 | 8.6 | 1.00 |
| jython-2.5 | 8.9 | 1.03 |
| python-3.2 | 11.0 | 1.28 |
| python-2.6 | 15.9 | 1.85 |
| ruby-1.8 | 42.6 | 4.95 |
The profiler run (python -m cProfile iter_gcd.py 4000 3000) shows that 80% of the time is spent calling the gcd() function, so the difference is indeed in the gcd() function.
I wrote a cython_gcd extension for Python using Cython. Contents of cython_gcd.pyx:
def gcd(m, n):
    while n:
        n, m = m % n, n
    return m

def gcd_int(int m, int n):
    while n:
        n, m = m % n, n
    return m
It is used in iter_gcd.py as follows: from cython_gcd import gcd, gcd_int.
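For concreteness, a sketch of what the adapted driver might look like (hypothetical file name; the original only specifies the import line):

# iter_gcd_cy.py -- sketch: the Python 2.x driver delegating to the extension
from sys import argv
from cython_gcd import gcd_int   # use gcd instead for arbitrary-size integers

def main(a1, a2):
    comp = 0
    for j in xrange(a1, 1, -1):
        for i in xrange(1, a2):
            comp += gcd_int(i, j)
    print(comp)

if __name__ == '__main__':
    main(int(argv[1]), int(argv[2]))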
To try the extension, run python setup.py build_ext --inplace, where setup.py is:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext_modules = [Extension("cython_gcd", ["cython_gcd.pyx"])]

setup(
    name = 'Hello world app',
    cmdclass = {'build_ext': build_ext},
    ext_modules = ext_modules
)
To install the extension globally, run python setup.py install.
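A quick smoke test after building (the expected value follows from gcd(4000, 3000) = 1000):

$ python -c "from cython_gcd import gcd_int; print(gcd_int(4000, 3000))"
1000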