I want to calculate the value change by group.
This is the pandas DataFrame df I have:
Group Date Value
A 01-02-2016 16
A 01-03-2016 15
A 01-04-2016 14
A 01-05-2016 17
A 01-06-2016 19
A 01-07-2016 20
B 01-02-2016 16
B 01-03-2016 13
B 01-04-2016 13
C 01-02-2016 16
C 01-03-2016 16
I want to determine that for Group A the values are going up, for Group B they are going down, and for Group C they are not changing.
I am not sure how to approach it, since in Group A the values initially decrease and then increase. So should I look at the average change or the most recent change?
Should I use pct_change? http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pct_change.html I was not sure how to specify the timeframe for that.
df.groupby.pct_change
It would be great if I could visualize it too. Any advice or hint is greatly appreciated! Thank you
Use pct_change in a groupby:
d1 = df.set_index(['Date', 'Group']).Value
d2 = d1.groupby(level='Group').pct_change()
print(d2)
Date Group
2016-01-02 A NaN
2016-01-03 A -0.062500
2016-01-04 A -0.066667
2016-01-05 A 0.214286
2016-01-06 A 0.117647
2016-01-07 A 0.052632
2016-01-02 B NaN
2016-01-03 B -0.187500
2016-01-04 B 0.000000
2016-01-02 C NaN
2016-01-03 C 0.000000
Name: Value, dtype: float64
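To reduce these row-by-row changes to a single verdict per group, one option is to compare the overall change (last value against first) with the average and most recent changes. The following is only a sketch under assumptions: the summary column names, the helper label_trend, and its tolerance tol are hypothetical choices for illustration, not part of pandas or of the original answer.

```python
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame({
    "Group": list("AAAAAABBBCC"),
    "Date": pd.to_datetime([
        "2016-01-02", "2016-01-03", "2016-01-04", "2016-01-05",
        "2016-01-06", "2016-01-07", "2016-01-02", "2016-01-03",
        "2016-01-04", "2016-01-02", "2016-01-03",
    ]),
    "Value": [16, 15, 14, 17, 19, 20, 16, 13, 13, 16, 16],
})

# Sort so that "first" and "last" follow the dates within each group.
df = df.sort_values(["Group", "Date"])
grouped = df.groupby("Group")["Value"]
pct = grouped.pct_change()  # NaN on the first row of every group

summary = pd.DataFrame({
    # Last value relative to the first value of the group.
    "overall_change": grouped.last() / grouped.first() - 1,
    # Average of the period-to-period changes (NaN rows are skipped).
    "mean_change": pct.groupby(df["Group"]).mean(),
    # Most recent period-to-period change.
    "last_change": pct.groupby(df["Group"]).last(),
})


def label_trend(change, tol=1e-9):
    """Hypothetical helper: map a change to 'up', 'down' or 'flat'."""
    if change > tol:
        return "up"
    if change < -tol:
        return "down"
    return "flat"


# Assumption: the overall change decides the label.
summary["trend"] = summary["overall_change"].apply(label_trend)
print(summary)
```

Which column should drive the label depends on the question being asked. The overall change ignores the dip in Group A, while last_change reacts only to the final step, so Group B would read as flat on that measure.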
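One of many ways to visualize and compare the groups is to see how they grow. In this case, I'd
fillna(0) so each group's first row counts as no change
add(1) to turn each percent change into a growth factor
cumprod() within each group to compound the factors into growth relative to the group's first value
d2.fillna(0).add(1).groupby(level='Group').cumprod().unstack().plot()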
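As a sanity check on the growth curves, compounding the percent changes within a group should give the same numbers as dividing each value by the group's first value. The sketch below rebuilds the example data and compares the two; the variable names growth, direct and wide are illustrative only, and the plotting call is left out so the snippet runs without matplotlib.

```python
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame({
    "Group": list("AAAAAABBBCC"),
    "Date": pd.to_datetime([
        "2016-01-02", "2016-01-03", "2016-01-04", "2016-01-05",
        "2016-01-06", "2016-01-07", "2016-01-02", "2016-01-03",
        "2016-01-04", "2016-01-02", "2016-01-03",
    ]),
    "Value": [16, 15, 14, 17, 19, 20, 16, 13, 13, 16, 16],
})

value = df.set_index(["Date", "Group"])["Value"]

# Compound the percent changes separately for each group.
growth = (
    value.groupby(level="Group").pct_change()
    .fillna(0)                       # first row of each group: no change
    .add(1)                          # percent change -> growth factor
    .groupby(level="Group").cumprod()
)

# Same quantity computed directly: value relative to the group's first value.
direct = value / value.groupby(level="Group").transform("first")

# One column per group, indexed by date; missing dates show up as NaN.
wide = growth.unstack()
print(wide)
```

Calling wide.plot() would then draw one line per group, each starting at 1.0.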
Setup
from io import StringIO
import pandas as pd
txt = """Group Date Value
A 01-02-2016 16
A 01-03-2016 15
A 01-04-2016 14
A 01-05-2016 17
A 01-06-2016 19
A 01-07-2016 20
B 01-02-2016 16
B 01-03-2016 13
B 01-04-2016 13
C 01-02-2016 16
C 01-03-2016 16 """
df = pd.read_csv(StringIO(txt), sep=r'\s+', parse_dates=[1])