I am working with a 2D Numpy masked_array in Python. I need to change the data values in the masked area such that they equal the nearest unmasked value.
NB. If there are more than one nearest unmasked values then it can take any of those nearest values (which ever one turns out to be easiest to code…)
e.g.
import numpy
import numpy.ma as ma
a = numpy.arange(100).reshape(10,10)
fill_value=-99
a[2:4,3:8] = fill_value
a[8,8] = fill_value
a = ma.masked_array(a,a==fill_value)
>>> a [[0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]
[20 21 22 -- -- -- -- -- 28 29]
[30 31 32 -- -- -- -- -- 38 39]
[40 41 42 43 44 45 46 47 48 49]
[50 51 52 53 54 55 56 57 58 59]
[60 61 62 63 64 65 66 67 68 69]
[70 71 72 73 74 75 76 77 78 79]
[80 81 82 83 84 85 86 87 -- 89]
[90 91 92 93 94 95 96 97 98 99]],
>>> a.data [[0 1 2 3 4 5 6 7 8 9] [10 11 12 13 14 15 16 17 18 19] [20 21 22 ? 14 15 16 ? 28 29] [30 31 32 ? 44 45 46 ? 38 39] [40 41 42 43 44 45 46 47 48 49] [50 51 52 53 54 55 56 57 58 59] [60 61 62 63 64 65 66 67 68 69] [70 71 72 73 74 75 76 77 78 79] [80 81 82 83 84 85 86 87 ? 89] [90 91 92 93 94 95 96 97 98 99]],
NB. where "?" could take any of the adjacent unmasked values.
What is the most efficient way to do this?
Thanks for your help.
I generally use a distance transform, as wisely suggested by Juh_ in this question.
This does not directly apply to masked arrays, but I do not think it will be that hard to transpose there, and it is quite efficient, I've had no problem applying it to large 100MPix images.
Copying the relevant method there for reference :
import numpy as np
from scipy import ndimage as nd
def fill(data, invalid=None):
"""
Replace the value of invalid 'data' cells (indicated by 'invalid')
by the value of the nearest valid data cell
Input:
data: numpy array of any dimension
invalid: a binary array of same shape as 'data'. True cells set where data
value should be replaced.
If None (default), use: invalid = np.isnan(data)
Output:
Return a filled array.
"""
#import numpy as np
#import scipy.ndimage as nd
if invalid is None: invalid = np.isnan(data)
ind = nd.distance_transform_edt(invalid, return_distances=False, return_indices=True)
return data[tuple(ind)]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With