Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

String performance - Python 2.7 vs Python 3.4 under Windows 10 vs. Ubuntu

Use case
A simple function which checks if a specific string is in another string at a position which is a multiple of 3 (see here for a real world example, finding stop codons in a DNA sequence).

Functions
sliding_window: takes a string of length 3 compares it to the search string, if they are identical moves 3 characters forward.
incremental_start: tries to find the search string, if the found position is not a multiple of 3, it tries to find the next position after the found position.

Please note: The sample data is just to make sure that each function has to go through the complete string, the performance is similar with real data or random data.

Results

  • Python 2.7: The initial sliding_window function could be improved by a factor of ~39 by using the function incremental_start in Python2.7 on Windows 10. There was slight drop in the performance improvement on Ubuntu, ~34x, ~37x, ~18x (VM, AWS, native), but still in the same range.
  • Python 3.4: sliding_window became slower than in Python2.7 (1.8x on Windows, 1.4x resp. 1.5x on all Ubuntus), but the incremental_start performance dropped on all Ubuntus by a factor of 4, 5, 1.7 (VM, AWS, native), while it hardly changed on Windows.
  • Windows vs Ubuntu
    Python2.7: virtualized Ubuntus needed less time for both functions(~20-30%), native Ubuntu was about 25% slower for the incremental_start, while sliding_window was 40% faster.
    Python3: the sliding_window function needed less time to finish (~50%), while the incremental_start became slower by a factor of ~2-3.

Questions

  • What causes the performance difference in Python 2 vs. Python 3 on Linux vs. Windows?
  • How is it possible to anticipate such a behavior and adjust the code for optimal performance?

Code

import timeit

text = 'ATG' * 10**6
word = 'TAG'

def sliding_window(text, word):
    for pos in range(0, len(text), 3):
        if text[pos:pos + 3] == word:
            return False
    return True

def incremental_start(text, word):
    start = 0
    while start != -1:
        start = text.find(word, start + 1)
        if start % 3 == 0:
            return False
    return True

#sliding window
time = timeit.Timer(lambda: sliding_window(text, word), setup='from __main__ import text, word').timeit(number=10)
print('%3.3f' % time)

#incremental start
time = timeit.Timer(lambda: incremental_start(text, word), setup='from __main__ import text, word').timeit(number=500)
print('%3.3f' % time)

Tables

Ubuntu vs Windows    VM     AWS    Native   
Python2.7-Increment  79%    73%    126% 
Python2.7-Sliding    70%    70%    60%                  
Python3.4-Increment  307%   346%   201% 
Python3.4-Sliding    54%    59%    48%  

Py2 vs 3    Windows    VM    AWS    Native
Increment   105%       409%  501%   168%
Sliding     184%       143%  155%   147%

Absolute times in seconds
                 Win10   Ubuntu  AWS     Native
Py2.7-Increment  1.759   1.391   1.279   2.215 
Py2.7-Sliding    1.361   0.955   0.958   0.823 

Py3.4-Increment  1.853   5.692   6.406   3.722 
Py3.4-Sliding    2.507   1.365   1.482   1.214 

Details
Windows 10: Native Windows, 32bit Python 3.4.3 or 2.7.9, i5-2500, 16GB RAM
Ubuntu virtual machine: 14.04, Running on the Windows host, 64bit Python 3.4.3, Python 2.7.6, 4 cores, 4GB RAM
AWS: 14.04, AWS micro instance, 64bit Python 3.4.3, Python 2.7.6
Native Ubuntu: 14.04, 64bit Python 3.4.3, Python 2.7.6, i5-2500, 16GB ram [identical to Win10 machine]


Update

As suggested by Ingaz xrange and bytes were used, slight improvement in performance but still massive drop in performance on Ubuntu with Python3.4. The culprit seems to be find which is much slower when Ubuntu and Py3.4 are combined (same with Py3.5 which compiled was from source). This seems to Linux flavor dependent, on Debian Py2.7 and Py3.4 performed identical, on RedHat Py2.7 was considerably faster than Py3.4.
For better comparison Py3.4 is now used in 64bit on Windows10 and Ubuntu. Py27 is still used on Win10.

import timeit, sys

if sys.version_info >= (3,0):
    from builtins import range as xrange

def sliding_window(text, word):
    for pos in range(0, len(text), 3):
        if text[pos:pos + 3] == word:
            return False
    return True

def xsliding_window(text, word):
    for pos in xrange(0, len(text), 3):
        if text[pos:pos + 3] == word:
            return False
    return True

def incremental_start(text, word):
    start = 0
    while start != -1:
        start = text.find(word, start + 1)
        if start % 3 == 0:
            return False
    return True

text = 'aaa' * 10**6
word = 'aaA'
byte_text = b'aaa' * 10**6
byte_word = b'aaA'

time = timeit.Timer(lambda: sliding_window(text, word), setup='from __main__ import text, word').timeit(number=10)
print('Sliding, regular:      %3.3f' % time)

time = timeit.Timer(lambda: incremental_start(text, word), setup='from __main__ import text, word').timeit(number=500)
print('Incremental, regular:  %3.3f' % time)

time = timeit.Timer(lambda: sliding_window(byte_text, byte_word), setup='from __main__ import byte_text, byte_word').timeit(number=10)
print('Sliding, byte string:  %3.3f' % time)

time = timeit.Timer(lambda: incremental_start(byte_text, byte_word), setup='from __main__ import byte_text, byte_word').timeit(number=500)
print('Incremental, bytes:    %3.3f' % time)

time = timeit.Timer(lambda: xsliding_window(byte_text, byte_word), setup='from __main__ import byte_text, byte_word').timeit(number=10)
print('Sliding, xrange&bytes: %3.3f' % time)

time = timeit.Timer(lambda: text.find(word), setup='from __main__ import text, word').timeit(number=1000)
print('simple find in string: %3.3f' % time)


Win10-py27  Wi10-py35   VM-py27  VM-py34
1.440       2.674       0.993    1.368 
1.864       1.425       1.436    5.711 
1.439       2.388       1.048    1.219 
1.887       1.405       1.429    5.750 
1.332       2.356       0.772    1.224 
3.756       2.811       2.818    11.361 
like image 827
Maximilian Peters Avatar asked May 05 '16 13:05

Maximilian Peters


1 Answers

Although you are measuring speed of the same code, the structures in your code are different.

A. range in 2.7 is type 'list', range in 3.4 is class 'range'

B. 'ATG' * 10**6 in 2.7 is a bytes string and in 3.4 it's and unicode string

You can try to produce more compatible results if: a) use xrange for 2.7 variant, b) use bytes string in both examples: b'ATG' or unicode strings in both examples.

Update

I suspected that difference in performance stems from main factors: a) 32bit vs 64bit, b) C compiler.

So, I did tests for:

  1. ActiveState Python 2.7.10 32bit
  2. ActiveState Python 2.7.10 64bit
  3. Official distribution Python 2.7.11 32bit
  4. Official distribution Python 2.7.11 64bit
  5. Python 2.7.6 64bit on Ubuntu on Windows 10
  6. pypy-5.1.1-win32

What I expected

I expected that:

  • 64bit version will be slower
  • ActiveState will be a little faster
  • PyPy faster by magnitude
  • Ubuntu on Windows 10 - ???

Results

Test                    as32b   as64b   off32b   off64b  ubw64b  pypy5.1.1
Sliding, regular:       1.232   1.230   1.281    1.136   0.951   0.099  
Incremental, regular:   1.744   1.690   2.219    1.647   1.472   2.772
Sliding, byte string:   1.223   1.207   1.280    1.127   0.926   0.101
Incremental, bytes:     1.720   1.701   2.206    1.646   1.568   2.774
Sliding, xrange&bytes:  1.117   1.102   1.162    0.962   0.779   0.109
simple find in string:  3.443   3.412   4.607    3.300   2.487   0.289

And the winner on Windows 10 is .... Ubuntu Python compiled by GCC 4.8.2 for Linux!

This result was completely unexpected for me.

32 vs 64: turned irrelevant.

PyPy: as always megafast, except cases when it's not.

I can't interprete this results, OP question turned not so simple as it seemed.

like image 76
Alex Yu Avatar answered Nov 08 '22 14:11

Alex Yu