Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does python's built in binary search function run so much faster?

(Already answered by sharth's comment.)

I've written a binary search algorithm in python, that more or less follows the same structure as the bisect_left function found in the bisect module. In fact it has a couple less conditionals as I know that the high point will be the length of the list and the low will be 0. Yet for some reason the built in function runs 5 times as fast as mine.

My code is as follows:

def bisection_search(word, t):

    high = len(t)
    low = 0

    while low < high:
        half = (high+low)/2
        if t[half] < word:
            low = half + 1
        else:
            high = half
    return low

The source code for the built in function is:

def bisect_left(a, x, lo=0, hi=None):
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    return lo

As you can see, virtually identical. However the timed out put for the my function (searching for the last term in an ordered list of 100,000 words) is -3.60012054443e-05, where as the built in achieves -6.91413879395e-06. What explains this difference?

In the source code there is a comment at the end that says "Overwrite above definitions with a fast C implementation" - is this what explains the difference? If so, how would I go about creating such a precompiled module?

Any advice is greatly appreciated.

like image 290
advert2013 Avatar asked Feb 22 '14 01:02

advert2013


People also ask

Is binary search the fastest?

Binary search is widely used and one of the fastest search algorithms. It works based on the divide and search principle. The data set must be sorted in increasing or decreasing order before providing it as input to the binary search algorithm.

Does Python have binary search built in?

While there's no explicit binary search algorithm in Python, there is a module - bisect - designed to find the insertion point for an element in a sorted list using a binary search.

Why is binary search efficient?

Binary search is an efficient algorithm for finding an item from a sorted list of items. It works by repeatedly dividing in half the portion of the list that could contain the item, until you've narrowed down the possible locations to just one.

How does binary search in Python work?

Binary search compares the target value to the middle element of the array. If they are not equal, the half in which the target cannot lie is eliminated, and the search continues on the remaining half, again taking the middle element to compare to the target value and repeating this until the target value is found.


1 Answers

To summarise the remarks above so the question can be closed, the reason the built in module is faster is because the modules are precompiled in c. There are basically two options to attempt to replicate such performance, one is to use a JIT compiler like PyPy where the compilation is done at run time, the other is to compile your own modules in C, using Cython or some other variant to integrate the C code with python. The link from sharth above to the c code for bisect is particularly helpful and can be found here. Thanks again for all the help.

like image 132
advert2013 Avatar answered Oct 05 '22 11:10

advert2013