Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Grouping of element in a list given a list of intervals

I have two lists of elements that look like

a=[['10', 'name_1'],['50','name_2'],['40','name_3'], ..., ['80', 'name_N']]
b=[(10,40),(40,60),(60,90),(90,100)]

a contains a set of data, and b defines some intervals, my aim is to create a list c with as many list as the intervals in b. Each list in c contains all the x elements in a for which x[0] is contained in the interval. Ex:

c=[
[['10', 'name_1']],
[['50','name_2'],['40','name_3']],
[...,['80', 'name_N']]
]
like image 617
fady Avatar asked Nov 02 '22 19:11

fady


1 Answers

You can use collections.defaultdict and bisect module here:

As the ranges are continuous so it would be better to convert the list b into something like this first:

[10, 40, 60, 90, 100]

The advantage of this is that we can now use bisect module to find the index where the items from a list can fit in. For example 50 will come between 40 and 60 so bisect.bisect_right will return 2 in this case. No we can use this 2 as key and stores the list as it's value. This way we can group those items based on the index returned from bisect.bisect_right.

L_b = 2* len(b)
L_a = len(a)
L_b1 = len(b1)

The overall complexity is going to be : max ( L_b log L_b , L_a log L_b1 )

>>> import bisect
>>> from collections import defaultdict
>>> b=[(10,40),(40,60),(60,90),(90,100)]
>>> b1 = sorted( set(z for x in b for z in x))
>>> b1
[10, 40, 60, 90, 100]
>>> dic = defaultdict(list)
for x,y in a:
    #Now  find the index where the value from the list can fit in the 
    #b1 list, bisect uses binary search so this is an O(log n ) step.
    # use this returned index as key and append the list to that key.
    ind = bisect.bisect_right(b1,int(x))
    dic[ind].append([x,y])
...     
>>> dic.values()
[[['10', 'name_1']], [['50', 'name_2'], ['40', 'name_3']], [['80', 'name_N']]]

As dicts don't have any specific order use sorting to get a sorted output:

>>> [dic[k] for k in sorted(dic)]
[[['10', 'name_1']], [['50', 'name_2'], ['40', 'name_3']], [['80', 'name_N']]]
like image 112
Ashwini Chaudhary Avatar answered Nov 15 '22 04:11

Ashwini Chaudhary