I have a time series (ts) indexed by a DatetimeIndex and want to group it into 10-minute intervals:
index x y z
ts1 ....
ts2 ....
...
I know how to group by 1 minute:

import datetime

def group_by_minute(timestamp):
    # Truncate the timestamp to the start of its minute.
    year = timestamp.year
    month = timestamp.month
    day = timestamp.day
    hour = timestamp.hour
    minute = timestamp.minute
    return datetime.datetime(year, month, day, hour, minute)
then
ts.groupby(group_by_minute, axis=0)
My custom function is (roughly):

def my_function(group):
    # Earliest latitude and latest longitude within the group.
    first_latitude = group['latitude'].sort_index().head(1).values[0]
    last_longitude = group['longitude'].sort_index().tail(1).values[0]
    return first_latitude - last_longitude

So the ts DataFrame definitely contains 'latitude' and 'longitude' columns.
When using TimeGrouper:
ts.groupby(pd.TimeGrouper(freq='100min')).apply(my_function)
I got the following error:
TypeError: cannot concatenate a non-NDFrame object
The groupby() function splits data into groups based on some criteria. pandas objects can be split on any of their axes. Abstractly, grouping means providing a mapping from labels to group names.
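As a minimal sketch of that label-to-group mapping applied to the 10-minute case: the function below maps each index timestamp to the start of its 10-minute window, in the same spirit as group_by_minute above. The sample data and the name ten_minute_bucket are hypothetical, made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: one row every 4 minutes.
index = pd.date_range("2024-01-01 00:00", periods=6, freq="4min")
ts = pd.DataFrame({"x": range(6)}, index=index)

def ten_minute_bucket(timestamp):
    # Map each index label to the start of its 10-minute window.
    return timestamp.floor("10min")

# groupby calls the function on every index label and groups rows by the result.
sums = ts.groupby(ten_minute_bucket)["x"].sum()
print(sums)
```

Only windows that actually contain rows appear in the result, because the groups come from the labels present in the index.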
Dates and Times in Python: The Python world has a number of available representations of dates, times, deltas, and timespans. While the time series tools provided by pandas tend to be the most useful for data science applications, it is helpful to see their relationship to other packages used in Python.
Although Groupby is much faster than pandas GroupBy.apply and GroupBy.transform with user-defined functions, pandas is much faster with common functions like mean and sum because they are implemented in Cython.
There is a pandas.TimeGrouper for this sort of thing. What you described would be something like:
agg_10m = df.groupby(pd.TimeGrouper(freq='10Min')).aggregate(numpy.sum) #or other function
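pd.TimeGrouper was deprecated and has since been removed from pandas; pd.Grouper(freq=...) is the replacement. Here is a rough sketch of the same idea with pd.Grouper, using made-up sample data and a hypothetical variant of the question's function. A fixed-frequency grouper can produce empty windows, so the function guards against them. Whether empty windows are what triggered the TypeError in the question is an assumption, not something confirmed here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a gap, so the 00:10 window has no rows.
index = pd.to_datetime([
    "2024-01-01 00:01", "2024-01-01 00:07",
    "2024-01-01 00:23", "2024-01-01 00:28",
])
ts = pd.DataFrame(
    {"latitude": [10.0, 11.0, 12.0, 13.0],
     "longitude": [1.0, 2.0, 3.0, 4.0]},
    index=index,
)

def first_lat_minus_last_lon(group):
    # Return NaN for a window with no rows instead of indexing into it.
    if group.empty:
        return np.nan
    group = group.sort_index()
    return group["latitude"].iloc[0] - group["longitude"].iloc[-1]

grouper = pd.Grouper(freq="10min")

# Built-in aggregation per 10-minute window.
agg_10m = ts.groupby(grouper).sum()

# Custom per-window computation that returns one scalar per group.
custom_10m = ts.groupby(grouper).apply(first_lat_minus_last_lon)
print(custom_10m)
```

Using .iloc[0] and .iloc[-1] after a single sort_index() avoids sorting twice and reads the first and last rows directly.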