I have a list of events with millisecond-accurate timestamps that spans a few days. I want to cluster all the events that occur in each 'per-n-minutes' slot (a slot can hold twenty events or none at all). I have a datetime.datetime item for each event, so I can get datetime.datetime.minute without any trouble.
My list of events is sorted in time order, earliest first, latest last. The list is complete for the time period I am working on.
The idea is that I can change the list:
[[a],[b],[c],[d],[e],[f],[g],[h],[i]...]
where a, b, c occur between minutes 0 and 29; d, e, f, g occur between minutes 30 and 59; nothing occurs between minutes 0 and 29 of the next hour; and h, i occur between minutes 30 and 59 ...
into a new list:
[[[a],[b],[c]],[[d],[e],[f],[g]],[],[[h],[i]]...]
I'm not sure how to build an iterator that loops through the two time slots until the time series list ends. Anything I can think of using xrange stops once it completes, so I wondered if there was a way of using 'while' to do the slicing?
I will also be using a smaller time slot, probably 5 minutes; I used 30 minutes here to keep the example short.
(For context, I'm making a geo-plotted, time-based view of the recent quakes in New Zealand, and want to show all the quakes that occur in a small block of time in one step to speed up the replay.)
# create sample data
from datetime import datetime, timedelta
d = datetime.now()
data = [d + timedelta(minutes=i) for i in range(100)]

# prepare and group the data
from itertools import groupby

def get_key(d):
    # group by 30 minutes: round down to the start of the slot
    k = d + timedelta(minutes=-(d.minute % 30))
    return datetime(k.year, k.month, k.day, k.hour, k.minute, 0)

# groupby only merges adjacent items, so the data must be sorted
g = groupby(sorted(data), key=get_key)

# print data
for key, items in g:
    print(key)
    for item in items:
        print('- %s' % item)
This is a Python translation of this answer, which works by rounding each datetime down to the start of its 30-minute slot and using that value as the grouping key.
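The same rounding idea can be parameterised for the smaller 5-minute slots mentioned in the question. The following is only a sketch: floor_to_slot is a hypothetical helper name, the sample timestamps are made up, and it assumes the slot width divides 60 evenly and that the events are already sorted.

```python
from datetime import datetime, timedelta
from itertools import groupby

def floor_to_slot(dt, minutes=5):
    """Round dt down to the start of its slot (assumes minutes divides 60)."""
    return dt.replace(minute=dt.minute - dt.minute % minutes,
                      second=0, microsecond=0)

# made-up sample data: one event every 90 seconds, already in time order
start = datetime(2011, 3, 1, 12, 0, 0)
events = [start + timedelta(seconds=90 * i) for i in range(10)]

# materialise each group, because groupby invalidates a group's
# iterator as soon as it moves on to the next key
slots = [(key, list(items))
         for key, items in groupby(events, key=lambda e: floor_to_slot(e, 5))]
```

Each entry in slots pairs the start of a 5-minute slot with the events that fall inside it. Slots that contain no events do not appear at all, since groupby only produces keys it has seen.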
If you really need the possible empty groups, you can just add them by using this or a similar method:
def add_missing_empty_frames(g):
    last_key = None
    for key, items in g:
        if last_key is not None:
            # compare timedeltas directly: .seconds ignores whole days,
            # so it would miss gaps longer than 24 hours
            while key - last_key > timedelta(minutes=30):
                empty_key = last_key + timedelta(minutes=30)
                yield (empty_key, [])
                last_key = empty_key
        yield (key, items)
        last_key = key

for key, items in add_missing_empty_frames(g):
    ...
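If the goal is the plain list of lists from the question, with an empty list for each empty slot, a 'while' loop can do the slicing directly without groupby. This is a minimal sketch rather than a definitive implementation: bucket_events is a hypothetical name, the sample timestamps are made up, and it assumes the events are datetime objects in time order and that the slot width divides 60 evenly.

```python
from datetime import datetime, timedelta

def bucket_events(events, minutes=30):
    """Return one list per consecutive slot, including empty slots.

    Assumes events is sorted and minutes divides 60 evenly.
    """
    if not events:
        return []
    step = timedelta(minutes=minutes)
    first = events[0]
    # start of the slot that contains the first event
    slot_start = first.replace(minute=first.minute - first.minute % minutes,
                               second=0, microsecond=0)
    buckets = [[]]
    for event in events:
        # move forward one slot at a time, leaving an empty
        # bucket behind for every slot that had no events
        while event >= slot_start + step:
            slot_start += step
            buckets.append([])
        buckets[-1].append(event)
    return buckets

# made-up sample: three events, two events, an empty slot, two events
base = datetime(2011, 3, 1, 10, 0)
sample = [base + timedelta(minutes=m) for m in (1, 10, 20, 35, 40, 95, 100)]
result = bucket_events(sample, 30)
```

Here result has four buckets, and the third one is empty because nothing happened between 11:00 and 11:29. If each event is a record rather than a bare datetime, the comparison would need to use whichever field holds the timestamp.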