I have a DataFrame with two columns. One contains timestamps and the other contains the ID of some action. Something like this:
2000-12-29 00:10:00 action1
2000-12-29 00:20:00 action2
2000-12-29 00:30:00 action2
2000-12-29 00:40:00 action1
2000-12-29 00:50:00 action1
...
2000-12-31 00:10:00 action1
2000-12-31 00:20:00 action2
2000-12-31 00:30:00 action2
I would like to know how many actions of a certain type were performed on a given day. That is, for every day, I need to count the occurrences of actionX and plot this data with the date on the X axis and the number of occurrences of actionX on the Y axis.
Of course, I can count the actions for each day naively by iterating through my dataset. But what's the "right way" to do it with pandas/matplotlib?
Using the size() or count() method with pandas.DataFrame.groupby() gives the number of occurrences of each value in a particular column of the DataFrame.
The pandas DataFrame count() method counts the non-missing (non-NA) values in each column, or in each row if you specify axis='columns', and returns a Series with the result for each column (or row).
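To make the difference between size() and count() concrete, here is a minimal sketch. The column names and the toy data (including the missing value) are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one action value is missing (NaN).
df = pd.DataFrame({
    "day": ["d1", "d1", "d1", "d2"],
    "action": ["action1", np.nan, "action1", "action2"],
})

grouped = df.groupby("day")["action"]

sizes = grouped.size()    # number of rows in each group, missing values included
counts = grouped.count()  # number of non-missing values in each group

print(sizes)
print(counts)
```

If the column has no missing values, the two methods give the same numbers, so either works for counting actions per day.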
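You can get the counts by using
df.groupby([df.index.date, 'action']).size()
or you can plot directly using
df.groupby([df.index.date, 'action']).size().plot(kind='bar')
(size() is used rather than count() here because, if 'action' is the only column, count() has no other column left to count and returns an empty result.)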
You could also store the result in a variable, for example count, and then plot it separately. This assumes that your index is already a DatetimeIndex; otherwise, follow the directions of @mkln above.
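Putting the pieces together, here is a rough end-to-end sketch. It assumes the timestamps start out as strings in a column I have called "timestamp" (that name, the helper plot_counts, and the axis labels are my own choices, not from the question), and it converts them to a DatetimeIndex first. The rows are the sample rows shown in the question.

```python
import pandas as pd

# Sample rows copied from the question; the column names are assumptions.
raw = pd.DataFrame({
    "timestamp": [
        "2000-12-29 00:10:00", "2000-12-29 00:20:00", "2000-12-29 00:30:00",
        "2000-12-29 00:40:00", "2000-12-29 00:50:00",
        "2000-12-31 00:10:00", "2000-12-31 00:20:00", "2000-12-31 00:30:00",
    ],
    "action": ["action1", "action2", "action2", "action1", "action1",
               "action1", "action2", "action2"],
})

# Parse the strings and use the timestamps as the index (a DatetimeIndex).
df = raw.assign(timestamp=pd.to_datetime(raw["timestamp"])).set_index("timestamp")

# size() counts rows per (day, action) pair; unstack() moves the actions
# into columns so that each row is one day. Missing pairs become 0.
counts = df.groupby([df.index.date, "action"]).size().unstack(fill_value=0)
print(counts)


def plot_counts(table):
    """Draw one group of bars per day, one bar per action (needs matplotlib)."""
    ax = table.plot(kind="bar")
    ax.set_xlabel("date")
    ax.set_ylabel("number of occurrences")
    return ax
```

Calling plot_counts(counts) draws the chart for all actions. To plot a single action, pass one column as a frame, e.g. plot_counts(counts[["action1"]]).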