I have a list that can contain both None values and datetime objects. I need to split it into sublists of consecutive datetime objects, and record the index of each sublist's first datetime object in the original list.
E.g., I need to be able to turn
original = [None, datetime(2013, 6, 4), datetime(2014, 5, 12), None, None, datetime(2012, 5, 18), None]
into:
(1, [datetime.datetime(2013, 6, 4, 0, 0), datetime.datetime(2014, 5, 12, 0, 0)])
(5, [datetime.datetime(2012, 5, 18, 0, 0)])
I have tried two approaches. One uses find():
binary = ''.join('1' if d else '0' for d in original)
end = 0
start = binary.find('1', end)
while start > -1:
    end = binary.find('0', start)
    if end < 0:
        end = len(binary)
    dates = original[start:end]
    print (start, dates)
    start = binary.find('1', end)
and one uses groupby():
from itertools import groupby

for key, group in groupby(enumerate(original), lambda x: x[1] is not None):
    if key:
        group = list(group)
        start = group[0][0]
        dates = [t[1] for t in group]
        print (start, dates)
Neither seems particularly Pythonic to me, though. Is there a better way?
I'd use a generator to produce the elements, encapsulating the grouping:
from itertools import takewhile

def indexed_date_groups(it):
    indexed = enumerate(it)
    for i, elem in indexed:
        if elem is not None:
            # takewhile() continues on the same iterator and collects the rest of the run
            yield (
                i, [elem] + [v for _, v in takewhile(
                    lambda pair: pair[1] is not None, indexed)])
Here I used itertools.takewhile() to produce the sublist once we find an initial non-None object.
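One detail of the takewhile() approach is worth spelling out: the for loop and takewhile() share a single enumerate() iterator, so takewhile() also pulls the first element that fails the test and discards it. That is harmless here, because the discarded element is always a None. A minimal sketch, using made-up string values in place of datetime objects:

```python
from itertools import takewhile

# Hypothetical sample data; the strings stand in for datetime objects.
items = [None, 'a', 'b', None, 'c']
indexed = enumerate(items)

next(indexed)            # skip (0, None)
first = next(indexed)    # (1, 'a') starts a run

# Collect the rest of the run from the same iterator.
rest = list(takewhile(lambda pair: pair[1] is not None, indexed))

# takewhile() has also consumed (3, None) to discover the end of the run,
# so the shared iterator resumes at the element after it.
remaining = list(indexed)
```

After this runs, rest holds only (2, 'b') and remaining holds only (4, 'c'); the pair (3, None) is gone from the shared iterator.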
You can still do the same with itertools.groupby(), of course:
from itertools import groupby

def indexed_date_groups(it):
    for key, group in groupby(enumerate(it), lambda v: v[1] is not None):
        if key:
            indices, elems = zip(*group)
            # zip() produces tuples; convert so each group is a list, as in the demo
            yield indices[0], list(elems)
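To make the groupby() version easier to follow, here is a small sketch of what groupby() hands to the loop when it is fed enumerate() pairs, and how zip(*group) then separates indices from values. The sample list uses placeholder strings rather than datetime objects, and the variable names are only illustrative:

```python
from itertools import groupby

# Hypothetical sample data; the strings stand in for datetime objects.
items = [None, 'a', 'b', None, None, 'c', None]

# Each run of consecutive pairs with the same key becomes one group.
runs = [
    (key, list(group))
    for key, group in groupby(enumerate(items), lambda pair: pair[1] is not None)
]

# zip(*pairs) transposes a run of (index, value) pairs
# into one tuple of indices and one tuple of values.
indices, elems = zip(*runs[1][1])
```

Here runs alternates between False groups (the None runs) and True groups (the value runs), and the transposed second run gives indices == (1, 2) and elems == ('a', 'b').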
Demo:
>>> list(indexed_date_groups(original))
[(1, [datetime.datetime(2013, 6, 4, 0, 0), datetime.datetime(2014, 5, 12, 0, 0)]), (5, [datetime.datetime(2012, 5, 18, 0, 0)])]
>>> original = [None, datetime(2013, 6, 4), datetime(2014, 5, 12), None, None, datetime(2012, 5, 18), None]
>>> for index, group in indexed_date_groups(original):
...     print(index, group)
...
1 [datetime.datetime(2013, 6, 4, 0, 0), datetime.datetime(2014, 5, 12, 0, 0)]
5 [datetime.datetime(2012, 5, 18, 0, 0)]
from itertools import groupby, count

idx = count()
for key, group in groupby(original, lambda x: x is not None):
    indices, group = zip(*((next(idx), i) for i in group))
    if key:
        print (indices[0], group)
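In the count()-based snippet above, the numbering step has to run for every group, including the runs of None, because the counter only advances as items are consumed. Moving it under the `if key:` check would leave the indices out of step. A sketch of the same idea wrapped in a generator, where the name indexed_runs and the sample data are hypothetical:

```python
from itertools import count, groupby

def indexed_runs(items):
    """Yield (start_index, run) for each run of consecutive non-None items."""
    idx = count()
    for key, group in groupby(items, lambda x: x is not None):
        # Number every item, even in runs of None, so the counter stays in step.
        numbered = [(next(idx), item) for item in group]
        if key:
            yield numbered[0][0], [item for _, item in numbered]

# Hypothetical sample data; the strings stand in for datetime objects.
result = list(indexed_runs([None, 'a', 'b', None, None, 'c', None]))
```

With this input, result is [(1, ['a', 'b']), (5, ['c'])].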