Computing percentage difference between pandas dataframe rows

Tags:

pandas

region  year      val
1.0     2015.0    6.775457e+05
1.0     2016.0    6.819761e+05
1.0     2017.0    6.864065e+05
2.0     2015.0    6.175457e+05
2.0     2016.0    6.419761e+05
3.0     2017.0    6.564065e+05

In the dataframe above, I want to compute the percentage difference between consecutive rows, but only within the same region. I tried the following, but I am not sure whether it works. What is the best way to achieve this?

df.groupby(['region', 'year'])['val'].pct_change()


asked Aug 15 '17 05:08

user308827

1 Answer

You can use DataFrameGroupBy.pct_change, grouping by the region column only:

df['new'] = df.groupby('region')['val'].pct_change()
print(df)
   region    year       val       new
0     1.0  2015.0  677545.7       NaN
1     1.0  2016.0  681976.1  0.006539
2     1.0  2017.0  686406.5  0.006496
3     2.0  2015.0  617545.7       NaN
4     2.0  2016.0  641976.1  0.039560
5     3.0  2017.0  656406.5       NaN
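
To see why the attempt in the question does not do what is wanted, here is a minimal sketch that rebuilds the sample data and compares the two groupings. Grouping by both region and year puts every row of this sample in its own group, so there is never a previous row to compare against and the result is all NaN. The sort by year and the column name pct are assumptions added for illustration. Note that pct_change returns a fraction (0.006539 means roughly 0.65%), so the sketch multiplies by 100 to get a percentage.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "region": [1.0, 1.0, 1.0, 2.0, 2.0, 3.0],
    "year": [2015.0, 2016.0, 2017.0, 2015.0, 2016.0, 2017.0],
    "val": [677545.7, 681976.1, 686406.5, 617545.7, 641976.1, 656406.5],
})

# Grouping by both columns: each (region, year) pair is unique here,
# so every group has one row and pct_change yields only NaN
per_region_year = df.groupby(["region", "year"])["val"].pct_change()

# pct_change compares each row with the one before it inside its group,
# so sort by year first (assumption: rows may not arrive already ordered)
df = df.sort_values(["region", "year"])

# Fractional change within each region, scaled to a percentage
# ("pct" is a hypothetical column name)
df["pct"] = df.groupby("region")["val"].pct_change() * 100

# The same calculation spelled out with shift, as a cross-check
prev = df.groupby("region")["val"].shift(1)
manual = (df["val"] - prev) / prev * 100

print(df)
```

The first row of each region stays NaN because it has no earlier row in its group, which is also why region 3, with a single row, gets NaN only.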


answered Nov 15 '22 01:11

jezrael
