Python: Replacing values in an array

Question

I have a one-dimensional data set in which no-data values are set to 9999. Here is an extract, as the full array is quite long:

this_array = [   4,    4,    1, 9999, 9999, 9999,   -5,   -4, ... ]

I would like to replace the no-data values with the average of the closest valid values on either side. However, some no-data values have other no-data values as their closest neighbours, which makes replacing them a little harder. For example, I would like the three no-data values above to be replaced with -2 (the average of 1 and -5). I have created a loop to go through each of the scalars in the array and test for no data:

for k in this_array:
    if k == 9999:
        temp = np.where(k == 9999, (abs(this_array[k-1]-this_array[k+1])/2), this_array[k])
    else:
        pass
this_array[k] = temp

However, I need to add an if statement, or some other way, to take the value before k-1 or after k+1 if that neighbour is also equal to 9999, e.g.:

if np.logical_or(k+1 == 9999, k-1 == 9999):
    temp = np.where(k == 9999, (abs(this_array[k-2]-this_array[k+2])/2), this_array[k])

As one can tell, this code gets messy: one may end up taking the wrong value, or end up with loads of nested if statements. Does anyone know of a cleaner way to implement this, given that the pattern of no-data values is pretty variable throughout the dataset?

As requested: If the first and/or last points are no data, they would preferably be replaced with the closest data point.

Andrew Clark · Accepted Answer

There may be a more efficient way to do this with numpy functions, but here is a solution using the itertools module (this_array must be a numpy array, since the code calls .fill on a slice):

from itertools import groupby

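for k, g in groupby(range(len(this_array)), lambda i: this_array[i] == 9999):
    if k:  # k is True for a run of consecutive 9999 values
        indices = list(g)
        # Average the valid values immediately before and after the run
        new_v = (this_array[indices[0]-1] + this_array[indices[-1]+1]) / 2
        this_array[indices[0]:indices[-1]+1].fill(new_v)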

If the first or last element can be 9999, you can use the following instead:

from itertools import groupby

for k, g in groupby(range(len(this_array)), lambda i: this_array[i] == 9999):
    if k:  # k is True for a run of consecutive 9999 values
        indices = list(g)
        prev_i, next_i = indices[0]-1, indices[-1]+1
        # If the run touches either end of the array, reuse the neighbour from the other side
        before = this_array[prev_i] if prev_i != -1 else this_array[next_i]
        after = this_array[next_i] if next_i != len(this_array) else before
        this_array[indices[0]:next_i].fill((before + after) / 2)

Example using second version:

>>> import numpy as np
>>> from itertools import groupby
>>> this_array = np.array([9999, 4, 1, 9999, 9999, 9999, -5, -4, 9999])
>>> for k, g in groupby(range(len(this_array)), lambda i: this_array[i] == 9999):
...     if k:
...         indices = list(g)
...         prev_i, next_i = indices[0]-1, indices[-1]+1
...         before = this_array[prev_i] if prev_i != -1 else this_array[next_i]
...         after = this_array[next_i] if next_i != len(this_array) else before
...         this_array[indices[0]:next_i].fill((before + after) / 2)
...
>>> this_array
array([ 4,  4,  1, -2, -2, -2, -5, -4, -4])
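A loop-free variant is also possible. The following is only a sketch and not part of the answer above: the function name fill_nodata_with_neighbor_mean and its nodata parameter are made up for illustration, and it assumes the array contains at least one valid value. It works on a float copy, so the averages are not truncated to integers.

```python
import numpy as np

def fill_nodata_with_neighbor_mean(values, nodata=9999):
    """Hypothetical helper: replace each nodata entry with the mean of the
    nearest valid values on either side (nearest valid value at the edges)."""
    arr = np.array(values, dtype=float)  # float copy of the input
    valid = arr != nodata
    if not valid.any():
        raise ValueError("array contains no valid data")
    idx = np.arange(arr.size)
    # Index of the nearest valid element at or before each position (-1 if none)
    prev_idx = np.maximum.accumulate(np.where(valid, idx, -1))
    # Index of the nearest valid element at or after each position (size if none)
    next_idx = np.minimum.accumulate(np.where(valid, idx, arr.size)[::-1])[::-1]
    # At the edges only one side exists, so fall back to that side
    prev_idx = np.where(prev_idx == -1, next_idx, prev_idx)
    next_idx = np.where(next_idx == arr.size, prev_idx, next_idx)
    # Valid entries are kept; nodata entries get the neighbour average
    return np.where(valid, arr, (arr[prev_idx] + arr[next_idx]) / 2)

result = fill_nodata_with_neighbor_mean([9999, 4, 1, 9999, 9999, 9999, -5, -4, 9999])
print(result.tolist())  # [4.0, 4.0, 1.0, -2.0, -2.0, -2.0, -5.0, -4.0, -4.0]
```

The running maximum and minimum carry the index of the last and next valid element across each gap, which is what lets this avoid an explicit Python loop.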

NPE · Answer

I'd do something along the following lines:

import numpy as np

def fill(arr, fwd_fill):
  out = arr.copy()
  if fwd_fill:
    start, end, step = 0, len(out), 1
  else:
    start, end, step = len(out)-1, -1, -1
  cur = out[start]
  for i in range(start, end, step):
    if np.isnan(out[i]):
      out[i] = cur
    else:
      cur = out[i]
  return out

def avg(arr):
  fwd = fill(arr, True)
  back = fill(arr, False)
  return (fwd[:-2] + back[2:]) / 2.

arr = np.array([   4,    4,    1, np.nan, np.nan, np.nan,   -5,   -4])
print(arr)
print(avg(arr))

The first function can do either a forward or a backward fill, replacing every NaN with the nearest non-NaN.

Once you have that, computing the average is trivial, and is done by the second function.

You don't say how you want the first and the last element handled, so the code just chops them off.

Finally, it is worth noting that the function can return NaNs if either the first or the last element of the input array is missing (in which case there's no data to compute some of the averages).
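Note that this answer marks missing values with NaN, whereas the question uses 9999. A minimal sketch of the conversion, assuming 9999 never occurs as a real measurement (the helper name sentinel_to_nan is hypothetical, not from the answer):

```python
import numpy as np

def sentinel_to_nan(values, nodata=9999):
    """Hypothetical helper: return a float copy with nodata entries set to NaN."""
    arr = np.array(values, dtype=float)  # NaN needs a float dtype; np.array copies
    arr[arr == nodata] = np.nan
    return arr

arr = sentinel_to_nan([4, 4, 1, 9999, 9999, 9999, -5, -4])
print(arr)
```

The resulting array can then be passed to the fill and avg functions above.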


Tags: python, arrays, numpy, median, interpolation

AJEnvMap
