Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does interp1d in scipy give a NaN when the first 2 values of the x-array are identical? (fill_value = 0)

import numpy as np
from scipy.interpolate import interp1d

x = np.array([ 0,  0,   0,  0,   0,  30])
time = np.array([ 5,  5,  10,  10,  10,  20])

intx = interp1d(time,x,'linear', 0, True, False, 0)
print intx([4,5,5,6,10,11,20, 20.0001])

>>> [  0.  nan  nan   0.   0.   3.  30.   0.]

As you can see, in all cases except where the time value == the first pair of values, the interpolant returns a real number.

I am aware of numpy.unique(), this is just an academic question. This is Anaconda Python 2.7 running in iPython.

Thanks!

like image 470
SkinnyTony Avatar asked Sep 25 '13 07:09

SkinnyTony


2 Answers

Your problem is that you are trying to interpolate points that are outside the interval, this causes that scipy.interpolate.interp1d launches a RuntimeWarning when it tries to calculate the slope between two points (it happens in interpolate.py around line 416):

slope = (y_hi - y_lo) / (x_hi - x_lo)[:, None]

See what happens when you move your points within the interval:

>>> import numpy as np
>>> from scipy.interpolate import interp1d
>>> x = np.array([ 5,  5,  10,  10,  10,  20])
>>> y = np.array([ 0,  0,   0,  0,   0,  30])
>>> X = np.array([5.1,5.1,5.1,6,10,11,20, 19.999])
>>> f = interp1d(x,y,'linear', 0, True, False, 0)
>>> Y = f(X)
 [  0.      0.      0.      0.      0.      3.     30.     29.997]

If you plot it you could see that all makes sense:

enter image description here

This is how interp1d works:

  1. You pass x and yto interp1d and it creates a f callable method
  2. Then you pass the new x_new values in which you want to evaluate f and it performs the following steps:

    • Find where in the original data, the values to interpolate would be inserted.

      >>> x_new_indices = np.searchsorted(x, X)
      
    • Clip x_new_indices so that they are within the range of x indices and at least 1. Removes mis-interpolation of x_new[n] = x[0]

      >>> x_new_indices = x_new_indices.clip(1, len(x)-1).astype(int)
      
    • Calculate the slope of regions that each x_new value falls in.

      >>> lo = x_new_indices - 1
      >>> hi = x_new_indices
      >>> x_lo = x[lo]
      >>> x_hi = x[hi]
      >>> y_lo = y[lo]
      >>> y_hi = y[hi]
      
    • Calculate the actual value for each entry in x_new.

      >>> slope = (y_hi - y_lo) / (x_hi - x_lo)[:, None]
      >>> y_new = slope*(x_new - x_lo)[:, None] + y_lo
      
like image 171
jabaldonedo Avatar answered Sep 24 '22 22:09

jabaldonedo


In the above case, I'd suggest just sample points for the Y variable. Eg. consider the following points.

x= [275, 275]
y= [120, 120]

the above points represent a line parallel to the Y-axis. Therefore, the slope of the line is undefined. So, here you can sample points just for the Y variable and replicate value of the X variable for each of them. You will find the below plots intuitive.
Plot 1 -
Two initial points

Plot 2 -
Sample only y points keeping x same!

like image 40
Niranjan Agnihotri Avatar answered Sep 26 '22 22:09

Niranjan Agnihotri