I have the following dataframe:
Joined User ID
0 2017-08-19 user 182737081
1 2017-05-07 user 227151009
2 2017-11-29 user 227306568
3 2016-05-22 user 13661634
4 2017-01-23 user 220545735
I'm trying to figure out how to plot user growth over time. I figured the best way is to plot a cumulative sum, so I put together some simple code:
tmp = members[['Joined']].copy()
tmp['count'] = 1
tmp.set_index('Joined', inplace=True)
Calling tmp.cumsum() on this produces the following:
count
Joined
2017-08-19 1
2017-05-07 2
2017-11-29 3
2016-05-22 4
2017-01-23 5
Now when I try to plot this using tmp.plot(), I get a very strange-looking plot.
The version of pandas I'm using: pandas (0.20.3)
In case you are curious whether the length of the series is the same as the highest count:
tmp.cumsum().max() == len(tmp)
count True
dtype: bool
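It seems you need sort_index, then cumsum, then plot. The index is not in chronological order, so the cumulative sum is accumulated in row order rather than in date order: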
# tmp.index = pd.to_datetime(tmp.index)  # only needed if the index is not already datetime
tmp.sort_index().cumsum().plot()
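For reference, here is a minimal end-to-end sketch of the same idea. The sample data and the helper name user_growth are made up for illustration (they are not from the question), and it assumes the Joined column can be parsed by pd.to_datetime:

```python
import pandas as pd

# Hypothetical sample data mirroring the 'Joined' column from the question
members = pd.DataFrame({
    "Joined": ["2017-08-19", "2017-05-07", "2017-11-29", "2016-05-22", "2017-01-23"],
})


def user_growth(df, date_col="Joined"):
    """Return the cumulative member count, indexed by join date in chronological order."""
    tmp = df[[date_col]].copy()
    tmp[date_col] = pd.to_datetime(tmp[date_col])  # make sure the dates are real datetimes
    tmp["count"] = 1
    # Sort by date first so the running total follows time, not row order
    return tmp.set_index(date_col)["count"].sort_index().cumsum()


growth = user_growth(members)
print(growth)
# growth.plot()  # uncomment to draw the curve (requires matplotlib)
```

Sorting before cumsum is the important part: both the index and the running total then increase together, so the line plot no longer doubles back on itself.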