How to Interpolate between all Values in Two Separate Pandas DataFrames?

Question

Let's assume you have two Pandas DataFrames, one containing data for the year 2020 and the other containing data for the year 2030. Both DataFrames have the same shape, column names, and only contain numeric values. For simplicity, we'll create them as follows:

twenty = pd.DataFrame({'A':[1,1,1], 'B':[3,3,3]})
thirty = pd.DataFrame({'A':[3,3,3], 'B':[7,7,7]})

Now, the goal is to perform a linear interpolation on all values in these DataFrames to obtain a new DataFrame for the year 2025 (or whatever year we select). So, we would want to interpolate between each paired set of values, such as twenty['A'][0] and thirty['A'][0]. If we did this for the target year 2025, the result should be:

twentyfive = pd.DataFrame({'A':[2,2,2],'B':[5,5,5]})

I've attempted to use np.interp; however, that is really intended for interpolation on a given (singular) array as far as I can tell. And I've solved the problem using a more brute-force method of melting the DataFrames, adding year columns, merging them together, and then creating a new column with the interpolated values. It's a bit messy and long-winded.

I feel like there must be a more straight-forward (and optimized) way of performing this task. Any help is appreciated.

Naga kiran · Accepted Answer

You can try of taking average directly, if both have same shape

(thirty + twenty)/2

Out:

Edit : if the dataframes does not have equal shapes, you can try of merging with inner join and groupby columns to take interpolated mean.

df = pd.merge(twenty,thirty, left_index=True, right_index=True, how='inner').rename(columns=lambda x: x.split('_')[0])
df.T.groupby(df.T.index).mean().T

Out:

ALollz · Answer

You can concat being smart about the keys (naming them integers), and then groupby allowing you to interpolate everything:

import pandas as pd

df = pd.concat([twenty, thirty], keys=[20,30], axis=1)
s = (df.groupby(df.columns.get_level_values(1), axis=1)
        .apply(lambda x: x.T.reset_index(1, drop=True).reindex(np.arange(20,31)).interpolate())).T

      20   21   22   23   24   25   26   27   28   29   30
A 0  1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4  2.6  2.8  3.0
  1  1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4  2.6  2.8  3.0
  2  1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4  2.6  2.8  3.0
B 0  3.0  3.4  3.8  4.2  4.6  5.0  5.4  5.8  6.2  6.6  7.0
  1  3.0  3.4  3.8  4.2  4.6  5.0  5.4  5.8  6.2  6.6  7.0
  2  3.0  3.4  3.8  4.2  4.6  5.0  5.4  5.8  6.2  6.6  7.0

Now if you just care about 25:

s[25].unstack(0)

     A    B
0  2.0  5.0
1  2.0  5.0
2  2.0  5.0

How to Interpolate between all Values in Two Separate Pandas DataFrames?

Tags:

python

pandas

interpolation

rhozzy

2 Answers

Naga kiran

ALollz

Recent Activity

Donate For Us

How to Interpolate between all Values in Two Separate Pandas DataFrames?

Tags:

python

pandas

interpolation

rhozzy

2 Answers

Naga kiran

ALollz

Related questions

Recent Activity

Donate For Us