Is there a quick way of replacing all NaN values in a numpy array with (say) the linearly interpolated values?
For example,
[1 1 1 nan nan 2 2 nan 0]
would be converted into
[1 1 1 1.3 1.6 2 2 1 0]
To check for NaN values in a NumPy array you can use the np.isnan() function. It returns a boolean mask with the same shape as the original array: True at the indices that hold NaN in the original array and False everywhere else.
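As a minimal sketch of that mask, here is the example array from the question run through np.isnan(). The variable names are illustrative only.

```python
import numpy as np

y = np.array([1, 1, 1, np.nan, np.nan, 2, 2, np.nan, 0])

# Boolean mask: True where y is NaN, False elsewhere
mask = np.isnan(y)

# Integer positions of the NaN entries
nan_positions = np.flatnonzero(mask)

print(mask.sum(), nan_positions)
```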
The np.interp() function returns the one-dimensional piecewise linear interpolant to a function with given discrete data points (xp, fp), evaluated at x. Its first parameter, x (array_like), holds the x-coordinates at which to evaluate the interpolated values.
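A small sketch of np.interp() on made-up data points may help; the numbers below are invented for illustration and are not from the question.

```python
import numpy as np

# Known data points (xp must be increasing)
xp = np.array([0.0, 1.0, 2.0])
fp = np.array([0.0, 10.0, 20.0])

# x-coordinates at which to evaluate the interpolant
x = np.array([0.5, 1.5])

values = np.interp(x, xp, fp)
print(values)  # the two values are 5.0 and 15.0
```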
A NaN is a special value for float arrays only, at least with the current version of NumPy.
Dropping the missing (NaN) values can be done with numpy.isnan(), which gives a boolean mask marking the entries that are NaN, combined with numpy.logical_not(), which inverts that mask so that it selects the non-NaN entries.
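A rough sketch of that combination, assuming the goal is simply a shorter array with the NaNs removed (variable names are illustrative):

```python
import numpy as np

y = np.array([1, 1, 1, np.nan, np.nan, 2, 2, np.nan, 0])

# Invert the NaN mask so that it selects the valid entries
valid = np.logical_not(np.isnan(y))

# Boolean indexing returns a new, shorter array without the NaNs
cleaned = y[valid]
print(cleaned)
```

Note that this changes the length of the array, which is why the question asks for interpolation instead.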
Let's first define a simple helper function to make it more straightforward to handle indices and logical indices of NaNs:
import numpy as np

def nan_helper(y):
    """Helper to handle indices and logical indices of NaNs.

    Input:
        - y, 1d numpy array with possible NaNs
    Output:
        - nans, logical indices of NaNs
        - index, a function, with signature indices= index(logical_indices),
          to convert logical indices of NaNs to 'equivalent' indices
    Example:
        >>> # linear interpolation of NaNs
        >>> nans, x= nan_helper(y)
        >>> y[nans]= np.interp(x(nans), x(~nans), y[~nans])
    """

    return np.isnan(y), lambda z: z.nonzero()[0]
The nan_helper(.) function can now be used like this:
>>> y = np.array([1, 1, 1, np.nan, np.nan, 2, 2, np.nan, 0])
>>>
>>> nans, x = nan_helper(y)
>>> y[nans] = np.interp(x(nans), x(~nans), y[~nans])
>>>
>>> print(y.round(2))
[1.   1.   1.   1.33 1.67 2.   2.   1.   0.  ]
---
Although it may at first seem like overkill to define a separate function just to do things like this:
>>> nans, x= np.isnan(y), lambda z: z.nonzero()[0]
it will eventually pay dividends.
So, whenever you are working with NaN-related data, just encapsulate all the (new NaN-related) functionality you need in one or more specific helper functions. Your code base will be more coherent and readable, because it follows easily understandable idioms.
Interpolation, indeed, is a nice context to see how NaN handling is done, but similar techniques are utilized in various other contexts as well.
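One way to take that encapsulation a step further is to wrap the whole interpolation in a single function. The sketch below is not part of the original answer: the name interpolate_nans is hypothetical, and it assumes a 1-D input with at least one non-NaN entry.

```python
import numpy as np

def interpolate_nans(y):
    """Return a copy of 1-D array y with NaNs linearly interpolated.

    Hypothetical helper, not from the original answer. Assumes y has
    at least one non-NaN entry.
    """
    y = np.array(y, dtype=float)  # copy, so the caller's data is untouched
    nans = np.isnan(y)
    idx = np.arange(y.size)
    # Evaluate the interpolant at the NaN positions, using the valid
    # positions and values as the known data points
    y[nans] = np.interp(idx[nans], idx[~nans], y[~nans])
    return y

filled = interpolate_nans([np.nan, 1, np.nan, 3, np.nan])
print(filled)
```

With np.interp()'s default behavior, NaNs before the first valid value or after the last one are filled with that nearest valid value rather than extrapolated.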
I came up with this code:
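import numpy as np
nan = np.nan

A = np.array([1, nan, nan, 2, 2, nan, 0])

# ~ inverts a boolean mask (unary minus is not supported on boolean arrays)
ok = ~np.isnan(A)
xp = ok.ravel().nonzero()[0]          # positions of the valid values
fp = A[~np.isnan(A)]                  # the valid values themselves
x = np.isnan(A).ravel().nonzero()[0]  # positions of the NaNs

# Fill the NaN positions with the linearly interpolated values
A[np.isnan(A)] = np.interp(x, xp, fp)

print(A)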
It prints
[ 1. 1.33333333 1.66666667 2. 2. 1. 0. ]