How to use `Series.interpolate` in pandas with the old values modified

Tags: python, pandas, interpolation

The interpolate method in pandas uses the valid data to fill in the NaN values. However, it leaves the existing valid data unchanged, as the following code shows.

Is there any way to use the interpolate method so that the existing values are also changed and the series becomes smooth?

In [1]: %matplotlib inline
In [2]: from scipy.interpolate import UnivariateSpline as spl
In [3]: import numpy as np
In [4]: import pandas as pd
In [5]: samples = { 0.0: 0.0, 0.4: 0.5, 0.5: 0.9, 0.6: 0.7, 0.8:0.3, 1.0: 1.0 }
In [6]: x, y = zip(*sorted(samples.items()))

In [7]: df1 = pd.DataFrame(index=np.linspace(0, 1, 31), columns=['raw', 'itp'], dtype=float)

In [8]: df1.loc[x] = np.array(y)[:, None]
In [9]: df1['itp'].interpolate('spline', order=3, inplace=True)
In [10]: df1.plot(style={'itp': 'b-', 'raw': 'rs'}, figsize=(8, 6))

In [11]: df2 = pd.DataFrame(index=np.linspace(0, 1, 31), columns=['raw', 'itp'], dtype=float)
In [12]: df2.loc[x, 'raw'] = y
In [13]: f = spl(x, y, k=3)
In [14]: df2['itp'] = f(df2.index)
In [15]: df2.plot(style={'itp': 'b-', 'raw': 'rs'}, figsize=(8, 6))

asked Aug 15 '15 09:08

Eastsun

1 Answer

When you use Series.interpolate with method='spline', pandas calls scipy.interpolate.UnivariateSpline under the hood.

The spline returned by UnivariateSpline is not guaranteed to pass through the data points given as input unless s=0. However, the default is s=None, which selects a nonzero smoothing factor, so the fitted curve generally does not pass through the input points.
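The effect of s can be checked directly with SciPy. The following is a minimal sketch using the sample points from the question; the names smooth, exact, smooth_err and exact_err are only illustrative.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Sample points taken from the question.
x = np.array([0.0, 0.4, 0.5, 0.6, 0.8, 1.0])
y = np.array([0.0, 0.5, 0.9, 0.7, 0.3, 1.0])

smooth = UnivariateSpline(x, y, k=3)       # s=None: smoothing spline
exact = UnivariateSpline(x, y, k=3, s=0)   # s=0: interpolating spline

# Largest distance between each fitted curve and the input points.
smooth_err = np.abs(smooth(x) - y).max()
exact_err = np.abs(exact(x) - y).max()
print(smooth_err, exact_err)
```

For this data, exact_err should be at rounding-error level, while smooth_err is clearly nonzero.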

Series.interpolate always fills in the NaN values without changing the non-NaN values; there is no way to make it modify the non-NaN values. So when s != 0, the filled-in values follow the smoothed curve while the original values stay where they were, and the result shows jagged jumps at the original data points.
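That Series.interpolate leaves the observed values alone can be verified with a small sketch. It assumes the same 31-point grid as the question, but places the samples by position (the positions list is my own choice) so that nothing depends on matching float labels.

```python
import numpy as np
import pandas as pd

index = np.linspace(0, 1, 31)
raw = pd.Series(np.nan, index=index)
# Place the question's samples by position to avoid float label matching.
positions = [0, 12, 15, 18, 24, 30]
raw.iloc[positions] = [0.0, 0.5, 0.9, 0.7, 0.3, 1.0]

# Default smoothing (s=None), as in the question.
filled = raw.interpolate(method='spline', order=3)

known = raw.notna()
# The original observations are still exactly where they were ...
unchanged = np.allclose(filled[known], raw[known])
# ... and every gap between them has been filled.
all_filled = filled.notna().all()
print(unchanged, all_filled)
```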

So if you want the default (s=None) spline but without the jagged jumps, then, as you've already found, you have to call UnivariateSpline directly and overwrite all the values in df['itp']:

df['itp'] = interpolate.UnivariateSpline(x, y, k=3)(df.index)
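If this is needed in more than one place, the same idea can be wrapped in a small function. The sketch below is only one way to do it: spline_smooth is a hypothetical helper, not part of pandas, and it assumes a numeric, strictly increasing index.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import UnivariateSpline


def spline_smooth(series, k=3, s=None):
    """Fit a spline to the non-NaN values and evaluate it on the whole index.

    Hypothetical helper: unlike Series.interpolate, it overwrites the
    observed values as well as the NaNs.
    """
    known = series.dropna()
    spline = UnivariateSpline(
        known.index.to_numpy(dtype=float),
        known.to_numpy(dtype=float),
        k=k,
        s=s,
    )
    values = spline(series.index.to_numpy(dtype=float))
    return pd.Series(values, index=series.index, name=series.name)


# Same grid and samples as the question, placed by position.
raw = pd.Series(np.nan, index=np.linspace(0, 1, 31))
raw.iloc[[0, 12, 15, 18, 24, 30]] = [0.0, 0.5, 0.9, 0.7, 0.3, 1.0]
smoothed = spline_smooth(raw)
```

Passing s=0 to the helper gives the interpolating spline instead, which reproduces the observed values.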

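If you want a cubic spline that passes through all the non-NaN data points, then use s=0 (assigning the result back avoids relying on inplace=True through a column selection, which may not update df in recent pandas versions):

df['itp'] = df['itp'].interpolate(method='spline', order=3, s=0)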

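import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.interpolate as interpolate

samples = {0.0: 0.0, 0.4: 0.5, 0.5: 0.9, 0.6: 0.7, 0.8: 0.3, 1.0: 1.0}
x, y = zip(*sorted(samples.items()))

fig, ax = plt.subplots(nrows=3, sharex=True)
df1 = pd.DataFrame(index=np.linspace(0, 1, 31), columns=['raw', 'itp'], dtype=float)
# Put the samples into both columns; every other row stays NaN.
# A list of labels is used because a tuple can be read as a multi-axis key.
df1.loc[list(x)] = np.array(y)[:, None]

df2 = df1.copy()
df3 = df1.copy()

# 1. Default smoothing (s=None): only the NaNs are filled, so the curve jumps at the samples.
df1['itp'] = df1['itp'].interpolate(method='spline', order=3)
# 2. Same smoothing spline evaluated on the whole index: the old values are overwritten.
df2['itp'] = interpolate.UnivariateSpline(x, y, k=3)(df2.index)
# 3. s=0: the spline passes through every sample.
df3['itp'] = df3['itp'].interpolate(method='spline', order=3, s=0)
for i, df in enumerate((df1, df2, df3)):
    df.plot(style={'itp': 'b-', 'raw': 'rs'}, figsize=(8, 6), ax=ax[i])
plt.show()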

answered Sep 17 '22 02:09

unutbu
