(Already answered by sharth's comment.)
I've written a binary search algorithm in Python that more or less follows the same structure as the bisect_left function found in the bisect module. In fact it has a couple fewer conditionals, since I know the high point will be the length of the list and the low point will be 0. Yet for some reason the built-in function runs about five times as fast as mine.
My code is as follows:
def bisection_search(word, t):
    high = len(t)
    low = 0
    while low < high:
        half = (low + high) // 2  # floor division keeps the index an integer
        if t[half] < word:
            low = half + 1
        else:
            high = half
    return low
The source code for the built-in function is:
def bisect_left(a, x, lo=0, hi=None):
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    return lo
As you can see, they are virtually identical. However, the timed output for my function (searching for the last term in an ordered list of 100,000 words) is -3.60012054443e-05, whereas the built-in achieves -6.91413879395e-06. What explains this difference?
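A comparison along these lines can be reproduced with timeit. This is only a rough sketch: the word list below is a made-up stand-in for the 100,000-word list, and bisection_search is the function defined above.

import timeit
import bisect
import random
import string

# Made-up stand-in for an ordered list of 100,000 words
words = sorted("".join(random.choices(string.ascii_lowercase, k=8)) for _ in range(100000))
target = words[-1]  # search for the last term, as in the timings above

# Time 1,000 lookups with each version (bisection_search is defined above)
print(timeit.timeit(lambda: bisection_search(target, words), number=1000))
print(timeit.timeit(lambda: bisect.bisect_left(words, target), number=1000))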
In the source code there is a comment at the end that says "Overwrite above definitions with a fast C implementation" - is this what explains the difference? If so, how would I go about creating such a precompiled module?
Any advice is greatly appreciated.
Binary search is a widely used and very fast search algorithm for finding an item in a sorted list. It works on a divide-and-conquer principle: it compares the target value to the middle element, discards the half in which the target cannot lie, and repeats on the remaining half until the possible locations are narrowed down to one. The data set must be sorted in increasing or decreasing order before it is given to the algorithm.
While there's no standalone binary search function in Python, there is a module - bisect - designed to find the insertion point for an element in a sorted list using a binary search.
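For example, bisect_left returns the index at which a value would be inserted to keep the list sorted (the list here is just an illustration):

import bisect

words = ["apple", "banana", "cherry", "date"]  # must already be sorted
print(bisect.bisect_left(words, "banana"))     # 1 -- index of an existing entry
print(bisect.bisect_left(words, "blueberry"))  # 2 -- insertion point for a value not in the list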
To summarise the remarks above so the question can be closed: the reason the built-in module is faster is that it is precompiled in C. There are basically two options for replicating that performance. One is to use a JIT compiler such as PyPy, where the compilation is done at run time; the other is to compile your own module in C, using Cython or some other variant to integrate the C code with Python. The link from sharth above to the C code for bisect is particularly helpful and can be found here. Thanks again for all the help.
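For anyone who wants to try the second route, a minimal Cython sketch might look something like the following (the file and module names are made up). It is built with python setup.py build_ext --inplace and then imported like any other module.

# search_c.pyx -- hypothetical module name; Cython compiles this loop to C
def bisection_search(word, list t):
    cdef Py_ssize_t low = 0
    cdef Py_ssize_t high = len(t)
    cdef Py_ssize_t half
    while low < high:
        half = (low + high) // 2
        if t[half] < word:
            low = half + 1
        else:
            high = half
    return low

# setup.py -- hypothetical build script
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("search_c.pyx"))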