In Python, given a list of sorted integers, I would to group them by consecutive values and tolerate gaps of 1.
For instance, given a list my_list
:
In [66]: my_list
Out[66]: [0, 1, 2, 3, 5, 6, 10, 11, 15, 16, 18, 19, 20]
I would like the following output:
[[0, 1, 2, 3, 5, 6], [10, 11], [15, 16, 18, 19, 20]]
Now, if I didn't have to tolerate gaps of 1, I could apply the neat solution explained here:
import itertools
import operator
results = []
for k, g in itertools.groupby(enumerate(my_list), lambda (i,x):i-x):
group = map(operator.itemgetter(1), g)
results.append(group)
Is there a way to incorporate my extra requirement in the above solution? If not, what's the best way to tackle the problem?
When in doubt you can always write your own generator:
def group_runs(li,tolerance=2):
out = []
last = li[0]
for x in li:
if x-last > tolerance:
yield out
out = []
out.append(x)
last = x
yield out
demo:
list(group_runs(my_list))
Out[48]: [[0, 1, 2, 3, 5, 6], [10, 11], [15, 16, 18, 19, 20]]
Numpy is a very useful tool, and not very difficult to learn.
This problem is solvable in O(n)
with a single line of code (excluding imports, data, and converting to list - if you really need it):
from numpy import array, diff, where, split
l= [0, 1, 2, 3, 5, 6, 10, 11, 15, 16, 18, 19, 20]
result= split(l, where(diff(l)>2)[0]+1 )
print map(list, result)
More importantly, the code is very fast if you need to process large lists, unlike a pure-python solution
Remember, groupby in itself, is pretty lame. The strength of itertools.groupby
is the key. For this particular problem, you need to create an appropriate key with memory (stateless key will not work here).
Implementation
class Key(object):
def __init__(self, diff):
self.diff, self.flag, self.prev = diff, [0,1], None
def __call__(self, elem):
if self.prev and abs(self.prev - elem) > self.diff:
self.flag = self.flag[::-1]
self.prev= elem
return self.flag[0]
Object
[list(g) for k, g in groupby(my_list, key = Key(2))]
[[0, 1, 2, 3, 5, 6], [10, 11], [15, 16, 18, 19, 20]]
How it Works
Every time, a new sub-list needs to be created (curr - prev > threshold
), you toggle a flag. There are different ways to toggle a flag
flag = 1; flag *= -1
flag = [0, 1 ]; flag = flag[::-1]
flag = 0; flag = 0 if flag else 1
Choose what ever your heart contends
So this generates an accompanying key along with your list
[0, 1, 2, 3, 5, 6, 10, 11, 15, 16, 18, 19, 20]
[0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0 , 0]
<------> <------>
Toggle flag Toggle flag
0 -> 1, as 1 -> 0, as
10 - 6 > 2 15 - 11 > 2
Now as itertools.groupby
, groups consecutive elements with same key, all elements with keys having consecutive 0
s or 1
s are grouped together
[0, 1, 2, 3, 5, 6, 10, 11, 15, 16, 18, 19, 20]
[0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0 , 0]
[0, 1, 2, 3, 5, 6], [10, 11], [15, 16, 18, 19, 20]
[0, 0, 0, 0, 0, 0], [1, 1], [0, 0, 0, 0 , 0]
And your final result would be
[0, 1, 2, 3, 5, 6], [10, 11], [15, 16, 18, 19, 20]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With