Replace NaN's in NumPy array with closest non-NaN value

Tags: python, arrays, nan, numpy

I have a NumPy array a like the following:

>>> str(a)
'[        nan         nan         nan  1.44955726  1.44628034  1.44409573\n  1.4408188   1.43657094  1.43171624  1.42649744  1.42200684  1.42117704\n  1.42040255  1.41922908         nan         nan         nan         nan\n         nan         nan]'

I want to replace each NaN with the closest non-NaN value, so that all of the NaN's at the beginning get set to 1.449... and all of the NaN's at the end get set to 1.419....

I can see how to do this for specific cases like this, but I need to be able to do it generally for any length of array, with any length of NaN's at the beginning and end of the array (there will be no NaN's in the middle of the numbers). Any ideas?

I can find the NaN's easily enough with np.isnan(), but I can't work out how to get the closest value to each NaN.

804

asked Mar 02 '12 17:03

robintw

3 Answers

As an alternative solution (this will also linearly interpolate across NaNs in the middle of the array):

import numpy as np

# Generate data...
data = np.random.random(10)
data[:2] = np.nan
data[-1] = np.nan
data[4:6] = np.nan

print(data)

# Fill in NaN's...
mask = np.isnan(data)
data[mask] = np.interp(np.flatnonzero(mask), np.flatnonzero(~mask), data[~mask])

print(data)

This yields:

[        nan         nan  0.31619306  0.25818765         nan         nan
  0.27410025  0.23347532  0.02418698         nan]

[ 0.31619306  0.31619306  0.31619306  0.25818765  0.26349185  0.26879605
  0.27410025  0.23347532  0.02418698  0.02418698]
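The leading and trailing NaNs are filled with the nearest known value because np.interp, by default, returns the first and last known values for query points that fall outside the known range. Here is a minimal sketch on a fixed array so the result is reproducible; the helper name fill_nans_interp is hypothetical (not part of NumPy), and it assumes a 1-D float array with at least one non-NaN entry:

```python
import numpy as np

def fill_nans_interp(data):
    """Return a copy of a 1-D float array with NaNs filled via np.interp.

    Hypothetical helper for illustration; assumes at least one non-NaN value.
    """
    data = np.asarray(data, dtype=float)
    mask = np.isnan(data)
    if mask.all():
        raise ValueError("array contains no non-NaN values")
    filled = data.copy()
    # Indices of known values are the x-coordinates, the values themselves the y's.
    # Query points outside the known range are clamped to the edge values.
    filled[mask] = np.interp(np.flatnonzero(mask), np.flatnonzero(~mask), data[~mask])
    return filled

example = np.array([np.nan, np.nan, 1.0, np.nan, 3.0, np.nan])
result = fill_nans_interp(example)
print(result)  # [1. 1. 1. 2. 3. 3.]
```

Note that the interior NaN becomes 2.0 (the linear midpoint), not a copy of a neighbour, so this only matches "closest value" behaviour at the ends of the array.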

137

answered Sep 28 '22 05:09

Joe Kington

I want to replace each NaN with the closest non-NaN value... there will be no NaN's in the middle of the numbers

The following will do it:

ind = np.where(~np.isnan(a))[0]
first, last = ind[0], ind[-1]
a[:first] = a[first]
a[last + 1:] = a[last]

This is a straight numpy solution requiring no Python loops, no recursion, no list comprehensions etc.
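One caveat: if a contains no non-NaN values at all, ind is empty and ind[0] raises an IndexError. A hedged sketch of the same slicing approach wrapped in a function that works on a copy and returns an all-NaN input unchanged; the name fill_edge_nans is hypothetical, and returning the input as-is in that case is just one possible choice:

```python
import numpy as np

def fill_edge_nans(a):
    """Fill leading/trailing NaNs of a 1-D array with the nearest non-NaN value.

    Hypothetical helper for illustration; interior NaNs are left untouched.
    """
    a = np.array(a, dtype=float)  # np.array copies, so the input is not modified
    ind = np.flatnonzero(~np.isnan(a))
    if ind.size == 0:
        return a  # nothing to fill from; assumption: hand back the all-NaN copy
    first, last = ind[0], ind[-1]
    a[:first] = a[first]      # leading NaNs take the first valid value
    a[last + 1:] = a[last]    # trailing NaNs take the last valid value
    return a

edges = np.array([np.nan, np.nan, 1.5, 1.4, 1.3, np.nan])
filled_edges = fill_edge_nans(edges)
print(filled_edges)  # [1.5 1.5 1.5 1.4 1.3 1.3]
```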

answered Sep 28 '22 03:09

NPE

NaNs have the interesting property of comparing unequal to themselves, so we can quickly find the indices of the non-NaN elements:

idx = np.nonzero(a==a)[0]
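As a quick check of that property, the sketch below (using a small made-up array) shows that a == a produces the same boolean mask as ~np.isnan(a) for a float array:

```python
import numpy as np

a = np.array([np.nan, 1.0, 2.0, np.nan])

print(np.nan == np.nan)  # False: NaN never equals anything, itself included

self_equal = (a == a)      # True only where the element is not NaN
not_nan = ~np.isnan(a)     # the more explicit spelling of the same mask

idx = np.nonzero(self_equal)[0]
print(idx)  # [1 2]
```

The np.isnan form is arguably easier to read; the self-comparison trick is shown here only because the answer relies on it.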

It's now easy to replace the NaNs with the desired values:

for i in range(0, idx[0]):
    a[i]=a[idx[0]]
for i in range(idx[-1]+1, a.size):
    a[i]=a[idx[-1]]

Finally, we can put this in a function:

import numpy as np

def FixNaNs(arr):
    if len(arr.shape)>1:
        raise Exception("Only 1D arrays are supported.")
    idxs=np.nonzero(arr==arr)[0]

    if len(idxs)==0:
        return None

    ret=arr  # no copy is made, so the caller's array is modified in place

    for i in range(0, idxs[0]):
        ret[i]=ret[idxs[0]]

    for i in range(idxs[-1]+1, ret.size):
        ret[i]=ret[idxs[-1]]

    return ret

edit

Ouch, coming from C++ I always forget about list ranges... @aix's solution is way more elegant and efficient than my C++ish loops, use that instead of mine.

answered Sep 28 '22 05:09

Matteo Italia
