I am working on a code that takes hourly data for a month and groups it into 24 hour sums. My problem is that I would like the index to read the date/year and I am just getting an index of 1-30.
The code I am using is
df = df.iloc[:,16:27].groupby([lambda x: x.day]).sum()
example of output I am getting
DateTime data
1 1772.031568
2 19884.42243
3 28696.72159
4 24906.20355
5 9059.120325
example of output I would like
DateTime data
1/1/2017 1772.031568
1/2/2017 19884.42243
1/3/2017 28696.72159
1/4/2017 24906.20355
1/5/2017 9059.120325
Resample Hourly Data to Daily Dataresample() method. To aggregate or temporal resample the data for a time period, you can take all of the values for each day and summarize them. In this case, you want total daily rainfall, so you will use the resample() method together with . sum() .
A Timestamp object in pandas is an equivalent of Python's datetime object. It is a combination of date and time fields. To combine date and time into a Timestamp object, we use the Timestamp. combine() function in pandas .
Pandas has a built-in function called to_datetime()that converts date and time in string format to a DateTime object. As you can see, the 'date' column in the DataFrame is currently of a string-type object. Thus, to_datetime() converts the column to a series of the appropriate datetime64 dtype.
This is an old question, but I don't think the accepted solution is the best in this particular case. What you want to accomplish is to down sample time series data, and Pandas has built-in functionality for this called resample(). For your example you will do:
df = df.iloc[:,16:27].resample('D').sum()
or if the datetime column is not the index
df = df.iloc[:,16:27].resample('D', on='datetime_column_name').sum()
There are (at least) 2 benefits from doing it this way as opposed to accepted answer:
For more information and examples, see documentation here: resample()
If your index is a datetime
, you can build a combined groupby
clause:
df = df.iloc[:,16:27].groupby([lambda x: "{}/{}/{}".format(x.day, x.month, x.year)]).sum()
or even better:
df = df.iloc[:,16:27].groupby([lambda x: x.strftime("%d%m%Y")]).sum()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With