
Interpolate NaN values in a numpy array

Is there a quick way of replacing all NaN values in a numpy array with (say) the linearly interpolated values?

For example,

[1 1 1 nan nan 2 2 nan 0] 

would be converted into

[1 1 1 1.3 1.6 2 2  1  0] 
Petter asked Jun 29 '11 09:06




2 Answers

Let's first define a simple helper function that makes it more straightforward to handle the indices and logical indices of NaNs:

    import numpy as np

    def nan_helper(y):
        """Helper to handle indices and logical indices of NaNs.

        Input:
            - y, 1d numpy array with possible NaNs
        Output:
            - nans, logical indices of NaNs
            - index, a function, with signature indices= index(logical_indices),
              to convert logical indices of NaNs to 'equivalent' indices
        Example:
            >>> # linear interpolation of NaNs
            >>> nans, x= nan_helper(y)
            >>> y[nans]= np.interp(x(nans), x(~nans), y[~nans])
        """

        return np.isnan(y), lambda z: z.nonzero()[0]

nan_helper(.) can now be used like this:

    >>> y = np.array([1, 1, 1, np.nan, np.nan, 2, 2, np.nan, 0])
    >>>
    >>> nans, x = nan_helper(y)
    >>> y[nans] = np.interp(x(nans), x(~nans), y[~nans])
    >>>
    >>> print(y.round(2))
    [1.   1.   1.   1.33 1.67 2.   2.   1.   0.  ]

---
Although it may at first seem like overkill to define a separate function for something as small as this:

    >>> nans, x = np.isnan(y), lambda z: z.nonzero()[0]

it will eventually pay dividends.

So, whenever you are working with NaN-related data, encapsulate all the new NaN-related functionality you need in one or more dedicated helper functions. Your code base will be more coherent and readable, because it follows easily understandable idioms.
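As a rough sketch of that kind of encapsulation, the steps shown above can be folded into a single function. The name `interpolate_nans` and the choice to return a copy instead of modifying the input are assumptions made for illustration, not part of the original answer:

```python
import numpy as np

def interpolate_nans(y):
    """Return a float copy of 1-D `y` with NaNs linearly interpolated.

    Hypothetical wrapper around the isnan / nonzero / np.interp steps.
    """
    y = np.array(y, dtype=float)  # np.array copies, so the caller's data is untouched
    nans = np.isnan(y)
    if nans.all():
        return y  # no valid samples to interpolate from
    idx = np.arange(y.size)  # positions 0..n-1 serve as the x-coordinates
    y[nans] = np.interp(idx[nans], idx[~nans], y[~nans])
    return y

y = np.array([1, 1, 1, np.nan, np.nan, 2, 2, np.nan, 0])
filled = interpolate_nans(y)
print(filled.round(2))
```

Returning a copy keeps the original array (with its NaNs) available, which can be handy when you want to compare raw and filled data.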

Interpolation, indeed, is a nice context to see how NaN handling is done, but similar techniques are utilized in various other contexts as well.

eat answered Sep 18 '22 19:09


I came up with this code:

    import numpy as np

    nan = np.nan
    A = np.array([1, nan, nan, 2, 2, nan, 0])

    ok = ~np.isnan(A)                     # True where A holds a valid number
    xp = ok.ravel().nonzero()[0]          # indices of the valid samples
    fp = A[ok]                            # values of the valid samples
    x = np.isnan(A).ravel().nonzero()[0]  # indices of the NaNs

    A[np.isnan(A)] = np.interp(x, xp, fp)

    print(A)

It prints

 [ 1.          1.33333333  1.66666667  2.          2.          1.          0.        ] 
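One case the example does not cover is NaNs at the start or end of the array. By default `np.interp` does not extrapolate: points outside the range of `xp` take the first or last valid value. A small sketch of that behavior, using a made-up array `a`:

```python
import numpy as np

# Hypothetical input with NaNs at both edges
a = np.array([np.nan, 1.0, np.nan, 3.0, np.nan])

ok = ~np.isnan(a)
xp = ok.nonzero()[0]    # indices of valid samples: [1, 3]
fp = a[ok]              # valid values: [1., 3.]
x = (~ok).nonzero()[0]  # indices of NaNs: [0, 2, 4]

# Edge NaNs are filled with the nearest valid value, not extrapolated
a[~ok] = np.interp(x, xp, fp)

print(a)  # [1. 1. 2. 3. 3.]
```

If a different edge value is wanted, `np.interp` accepts `left` and `right` arguments for points below and above the range of `xp`.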
Petter answered Sep 19 '22 19:09