Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python find numbers between range in list or array

I have a list with millions of numbers which are always increasing to the end, I need to find and return numbers within a specified range e.g. numbers greater than X but less than Y, the numbers in the list can change and the values I'm searching for change as well

I have been using this method, please note this is a basic example the numbers are not uniform or the same as shown below in my program

l = [i for i in range(2000000)]
nums = []
for element in l:
    if element > 950004:
        break
    if element > 950000:
        nums.append(element)
#[950001, 950002, 950003, 950004]

Although fast, I kind of need it to be a bit faster for what my program is doing, the numbers change a lot so I'm wondering if there's a better way to do this with a pandas series or a numpy array? but so far all I've done is make an example in numpy:

a = numpy.array(l,dtype=numpy.int64)

Would a pandas series be more functional? Making use of query()? what would be the best way to approach this with an array as opposed to a python list of python objects

like image 662
ragardner Avatar asked Oct 27 '17 16:10

ragardner


People also ask

Is Range () a list?

This is because range() returns a range object. This range object is an object that stores a range of numbers. It is thus not a list or tuple and has nothing to do with them. The range object is iterable, though.

How do you check if a range of values is in a list Python?

To check if given number is in a range, use Python if statement with in keyword as shown below. number in range() expression returns a boolean value: True if number is present in the range(), False if number is not present in the range.

How do you make a list of numbers between two values in Python?

Use range() to create a list of numbers between two values. Call range(start, stop) to construct a sequence of numbers. The sequence ranges from start up to but not including stop . Use list() to convert this sequence to a list.


2 Answers

Here is a solution using binary search. You are speaking of millions of numbers. Technically binary search will make the algorithm faster by decreasing the runtime complexity to O(log n) neglecting the final slicing step.

import bisect

l = [i for i in range(2000000)]
lower_bound = 950000
upper_bound = 950004

lower_bound_i = bisect.bisect_left(l, lower_bound)
upper_bound_i = bisect.bisect_right(l, upper_bound, lo=lower_bound_i)
nums = l[lower_bound_i:upper_bound_i]
like image 191
Calculator Avatar answered Sep 19 '22 13:09

Calculator


The following are two implementations for binary search (based on code from here) - one which searches for an upper limit and one which searches for a lower limit. Does this work better for you?

def binary_search_upper(seq, limit):
    min = 0
    max = len(seq) - 1
    while True:
        if max < min:
            return -1
        m = (min + max) / 2
        if m == (len(seq) -1) or (seq[m] <= limit and seq[m+1] > limit):
            return m
        elif seq[m] < limit:
            min = m+1
        else:
            max = m - 1

def binary_search_lower(seq, limit):
    min = 0
    max = len(seq) - 1
    while True:
        if max < min:
            return -1
        m = (min + max) / 2
        if m == 0 or (seq[m] >= limit and seq[m-1] < limit):
            return m
        elif seq[m] < limit:
            min = m+1
        else:
            max = m - 1


l = [i for i in range(2000000)]
print binary_search_upper(l, 950004)
print binary_search_lower(l, 950000)
like image 40
Maor Veitsman Avatar answered Sep 21 '22 13:09

Maor Veitsman