Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

search for before and after values in a long sorted list

Tags:

python

What would be the fastest way to search for a number (eg. 12.31) in long sorted list and get the values one before and after my "search" value when the exact value isn't found (eg. 11.12 and 12.03 in the list below)?
Many thanks in advance.

long_list = [10.11, 11.12, 13.03, 14.2 .. 12345.67]
like image 976
DGT Avatar asked Jul 08 '11 18:07

DGT


People also ask

How can we search for an element in a sorted list?

The idea is to find the pivot point, divide the array into two sub-arrays and perform a binary search. For a sorted (in increasing order) and rotated array, the pivot element is the only element for which the next element to it is smaller than it. Using binary search based on the above idea, pivot can be found.

Is an efficient algorithm for finding an item from a sorted list?

Binary search is an efficient algorithm for finding an item from a sorted list of items. It works by repeatedly dividing in half the portion of the list that could contain the item, until you've narrowed down the possible locations to just one.

Which algorithm can be used to find an element in a sorted array?

In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the middle element of the array.

What is the average time complexity of searching an item in a sorted array of size n?

Elements within a sorted array are found using a binary search, in O(log n); thus sorted arrays are suited for cases when one needs to be able to look up elements quickly, e.g. as a set or multiset data structure. This complexity for lookups is the same as for self-balancing binary search trees.


2 Answers

The fastest is probably to use built-in support in python. Here I'm thinking about the bisect module. Below I'm using a dictionary to quickly check in O(1) if a value is in the list; if not, bisect is used to find values smaller than and larger than the sought value.

#!/usr/bin/env python

import bisect

def find_lt(a, x):
    'Find rightmost value less than x'
    i = bisect.bisect_left(a, x)
    if i:
        return a[i-1]
    raise ValueError

def find_gt(a, x):
    'Find leftmost value greater than x'
    i = bisect.bisect_right(a, x)
    if i != len(a):
        return a[i]
    raise ValueError

# First create a test-list (49996 items)
i=1.0
R=[1.0]
D={}
while i < 10000:
    i+=0.2
    i=round(i,2)
    D[i]=True
    R.append(i)

# Locate a value, in this case 100.3 which is not in the list
x=100.3
if D.has_key(x):
    print "found", x
else:
    print find_lt(R, x)
    print find_gt(R, x)

Output for x=100.3:

100.2
100.4
like image 90
Fredrik Pihl Avatar answered Nov 25 '22 09:11

Fredrik Pihl


Exponential search (AKA galloping search) would perform better than plain binary search if the list is very long. The idea is to scan forward from position 0 on increasing steps until the answer is passed at this point a binary search can be performed to the range formed by the last two steps. If the element is not found then the last attempt will point to the closest elements.

Have a look at Basic Techniques for information retrieval. The pseudo-code algorithm is provided and they discuss its complexity against binary search.

like image 25
Manuel Salvadores Avatar answered Nov 25 '22 10:11

Manuel Salvadores