 

Using pandas.DataFrame.interpolate to add rows to DataFrame

Tags: python, pandas

I have a Pandas dataframe with the following format:

   Frequency  Value
1         10    2.8
2         20    2.5
3         30    2.2
4         40    2.3

I want to use pandas.DataFrame.interpolate to add a row at frequency 35, with a value linearly interpolated between the values at frequencies 30 and 40.

The example in the user manual shows how to replace a NaN, but not how to add values between existing ones (Pandas doc).

What would be the best way to proceed?

Anthony Lethuillier asked Jan 25 '17 14:01




1 Answer

I think you first need to add the new value 35 to the Frequency column with loc, then sort with sort_values, and then interpolate:

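# Add a new row labelled -1; its Value is NaN and Frequency becomes float
df.loc[-1, 'Frequency'] = 35
# Move the new row into place by frequency and renumber the index
df = df.sort_values('Frequency').reset_index(drop=True)
print(df)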
   Frequency  Value
0       10.0    2.8
1       20.0    2.5
2       30.0    2.2
3       35.0    NaN
4       40.0    2.3

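# The default method='linear' treats rows as equally spaced; it matches
# here only because 35 lies midway between 30 and 40
df = df.interpolate()
print(df)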
   Frequency  Value
0       10.0   2.80
1       20.0   2.50
2       30.0   2.20
3       35.0   2.25
4       40.0   2.30
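
To see why the spacing matters, here is a minimal sketch that inserts a frequency that is not midway between its neighbours. The value 32 and the variable names by_position and by_frequency are assumptions made for illustration, not part of the question. With the default method the new row gets the plain average of its neighbours, while method='index' on a Series indexed by Frequency weights by the actual frequencies.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({"Frequency": [10, 20, 30, 40],
                   "Value": [2.8, 2.5, 2.2, 2.3]})

# Add a row at a hypothetical frequency that is NOT midway between 30 and 40.
df.loc[len(df), "Frequency"] = 32
df = df.sort_values("Frequency").reset_index(drop=True)

# Default method='linear' ignores Frequency: the NaN becomes (2.2 + 2.3) / 2.
by_position = df["Value"].interpolate()

# method='index' uses the Frequency values as x-coordinates:
# 2.2 + (2.3 - 2.2) * (32 - 30) / (40 - 30).
by_frequency = df.set_index("Frequency")["Value"].interpolate(method="index")

print(by_position.iloc[3])    # value assigned when rows are treated as equally spaced
print(by_frequency.loc[32])   # value assigned when weighting by frequency
```

So if the new frequencies are not evenly placed, the index-based variant is probably the one you want.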

A solution using a Series, starting from the original DataFrame (thank you to Rutger Kassies for the idea).

DataFrame.squeeze converts a one-column DataFrame into a Series.

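import numpy as np
# squeeze turns the one-column DataFrame into a Series indexed by Frequency
s = df.set_index('Frequency').squeeze()
s.loc[35] = np.nan  # new frequency with a missing value
# method='index' interpolates using the Frequency values themselves
s = s.sort_index().interpolate(method='index')
print(s)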
Frequency
10    2.80
20    2.50
30    2.20
35    2.25
40    2.30
Name: Value, dtype: float64
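
If several frequencies have to be added at once, the same idea can be sketched with reindex. This is only one possible approach; the list new_freqs and the names full_index and result are hypothetical and chosen for the example.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({"Frequency": [10, 20, 30, 40],
                   "Value": [2.8, 2.5, 2.2, 2.3]})

new_freqs = [15, 35]  # hypothetical frequencies to insert

# Series of values indexed by frequency.
s = df.set_index("Frequency")["Value"]

# Sorted union of the existing and the new frequencies.
full_index = s.index.union(new_freqs)

# reindex adds the new labels with NaN; method='index' then fills them
# using the frequency values as x-coordinates.
s = s.reindex(full_index).interpolate(method="index")

# Back to a two-column DataFrame.
result = s.rename_axis("Frequency").reset_index()
print(result)
```

Note that method='index' only fills values between known points by linear interpolation; frequencies outside the range 10 to 40 are not extrapolated this way.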
jezrael answered Nov 10 '22 03:11