Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

More elegant way of find a range of repeating elements

Tags:

python

I have this problem.

let l be a list containing only 0's and 1's, find all tuples that represents the start and end of a repeating sequence of 1's. example

l=[1,1,0,0,0,1,1,1,0,1]

answer:

[(0,2),(5,8),(9,10)]

i solved the problem with the following code, but i think it is pretty messy, i would like to know if there is a cleaner way to solve this problem (maybe using map/reduce ?)

from collections import deque
def find_range(l):
    pairs=deque((i,i+1) for i,e in enumerate(l) if e==1)
    ans=[]
    p=[0,0]
    while(len(pairs)>1):
        act=pairs.popleft()
        nex=pairs[0]
        if p==[0,0]:
            p=list(act)
        if act[1]==nex[0]:
            p[1]=nex[1]
        else:
            ans.append(tuple(p))
            p=[0,0]
    if(len(pairs)==1):
        if p==[0,0]:
            ans.append(pairs.pop())
        else:
            ans.append((p[0],pairs.pop()[1]))
    return ans
like image 689
giuliano-oliveira Avatar asked Nov 30 '19 20:11

giuliano-oliveira


1 Answers

With itertools.groupby magic:

from itertools import groupby

lst = [1, 1, 0, 0, 0, 1, 1, 1, 0, 1]
indices, res = range(len(lst)), []
for k, group in groupby(indices, key=lambda i: lst[i]):
    if k == 1:
        group = list(group)
        sl = group[0], group[-1] + 1
        res.append(sl)
print(res)

The output:

[(0, 2), (5, 8), (9, 10)]

Or with a more efficient generator function:

def get_ones_coords(lst):
    indices = range(len(lst))
    for k, group in groupby(indices, key=lambda i: lst[i]):
        if k == 1:
            group = list(group)
            yield group[0], group[-1] + 1

lst = [1, 1, 0, 0, 0, 1, 1, 1, 0, 1]
print(list(get_ones_coords(lst)))   # [(0, 2), (5, 8), (9, 10)]

As a short bonus, here's alternative numpy approach, though sophisticated, based on discrete difference between consecutive numbers (numpy.diff) and extracting indices of non-zero items (numpy.faltnonzero):

In [137]: lst = [1,1,0,0,0,1,1,1,0,1]                                                                                        

In [138]: arr = np.array(lst)                                                                                                

In [139]: np.flatnonzero(np.diff(np.r_[0, arr, 0]) != 0).reshape(-1, 2)                                                      
Out[139]: 
array([[ 0,  2],
       [ 5,  8],
       [ 9, 10]])
like image 175
RomanPerekhrest Avatar answered Oct 05 '22 20:10

RomanPerekhrest