I have a pandas dataframe which has a datetime column. I'm grouping by day and then hour using the following:
df.groupby([df['date'].map(lambda t: t.day), df['date'].map(lambda t: t.hour)]).count()
Unfortunately, this leaves me with a two-level index in which both levels are called date. The first date is the day of the month, the second date is the hour, and bytes is the count of items in that hour:
I'm trying to use these date levels as columns but can't. I've tried resetting the index, but I receive this error:
ValueError: cannot insert date, already exists
I also can't rename the columns because "date" doesn't appear in the columns list:
grouped_df.columns
>> Index([u'bytes'], dtype='object')
Ultimately, I'm trying to find a count of number of items in each hour of each day. How can I rename the duplicate date columns? Should I be grouping the dataframe using a different method to avoid this dilemma?
I didn't test this, but something like this should work:
df.groupby([df['date'].rename("day").map(lambda t: t.day), df['date'].rename("hour").map(lambda t: t.hour)]).count()
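For reference, here is a minimal sketch of the same idea using the `.dt` accessor instead of `map` with a lambda. The sample data is made up for illustration; only the column names `date` and `bytes` come from the question. It also shows a second option, renaming the index levels after grouping with `rename_axis`, which should address the "cannot insert date, already exists" error in the same way.

```python
import pandas as pd

# Hypothetical sample data; only the column names "date" and "bytes"
# are taken from the question.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2014-01-01 10:05", "2014-01-01 10:40",
        "2014-01-01 11:15", "2014-01-02 10:30",
    ]),
    "bytes": [100, 200, 300, 400],
})

# Option 1: name each grouping key before grouping, so the two index
# levels get distinct names ("day" and "hour") instead of both "date".
day = df["date"].dt.day.rename("day")
hour = df["date"].dt.hour.rename("hour")
counts = df.groupby([day, hour])["bytes"].count().reset_index(name="count")

# Option 2: group first (both levels end up named "date"), then rename
# the index levels so reset_index no longer hits a name collision.
grouped = df.groupby([df["date"].dt.day, df["date"].dt.hour])["bytes"].count()
renamed = grouped.rename_axis(["day", "hour"]).reset_index(name="count")

print(counts)
```

Either way, the result is a flat dataframe with `day`, `hour`, and `count` columns, one row per hour of each day that has data. Note that grouping by day of month alone merges the same day number from different months; if the data spans more than one month, grouping on `df["date"].dt.date` instead of `.dt.day` may be closer to what is wanted.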