Using this small reproducible example, I've so far been unable to generate a new integer array from 3 arrays that contains unique groupings across all three input arrays.
The arrays are related to topographic properties:
import numpy as np
asp = np.array([8,1,1,2,7,8,2,3,7,6,4,3,6,5,5,4]).reshape((4,4)) #aspect
slp = np.array([9,10,10,9,9,12,12,9,10,11,11,9,9,9,9,9]).reshape((4,4)) #slope
elv = np.array([13,14,14,13,14,15,16,14,14,15,16,14,13,14,14,13]).reshape((4,4)) #elevation
The idea is that the geographic contours are broken into 3 different properties using GIS routines:
The small graphic below attempts to depict the kind of result I'm after (array shown in lower left). Note, the "answer" given in the graphic is but one possible answer. I'm not concerned about the final arrangement of integers in the resulting array so long as the final array contains an integer at each row/column index that identifies unique groupings.
For example, the array indexes at [0,1] and [0,2] have the same aspect, slope, and elevation and therefore receive the same integer identifier in the resulting array.
Does numpy have a built in routine for this kind of thing?
Each location in the grid is associated with a tuple composed of one value from
asp
, slp
and elv
. For example, the upper left corner has tuple (8,9,13)
.
We would like to map this tuple to a number which uniquely identifies this tuple.
One way to do that would be to think of (8,9,13)
as the index into the 3D array
np.arange(9*13*17).reshape(9,13,17)
. This particular array was chosen
to accommodate the largest values in asp
, slp
and elv
:
In [107]: asp.max()+1
Out[107]: 9
In [108]: slp.max()+1
Out[108]: 13
In [110]: elv.max()+1
Out[110]: 17
Now we can map the tuple (8,9,13) to the number 1934:
In [113]: x = np.arange(9*13*17).reshape(9,13,17)
In [114]: x[8,9,13]
Out[114]: 1934
If we do this for each location in the grid, then we get a unique number for each location. We could end right here, letting these unique numbers serve as labels.
Or, we can generate smaller integer labels (starting at 0 and increasing by 1)
by using np.unique
with
return_inverse=True
:
uniqs, labels = np.unique(vals, return_inverse=True)
labels = labels.reshape(vals.shape)
So, for example,
import numpy as np
asp = np.array([8,1,1,2,7,8,2,3,7,6,4,3,6,5,5,4]).reshape((4,4)) #aspect
slp = np.array([9,10,10,9,9,12,12,9,10,11,11,9,9,9,9,9]).reshape((4,4)) #slope
elv = np.array([13,14,14,13,14,15,16,14,14,15,16,14,13,14,14,13]).reshape((4,4)) #elevation
x = np.arange(9*13*17).reshape(9,13,17)
vals = x[asp, slp, elv]
uniqs, labels = np.unique(vals, return_inverse=True)
labels = labels.reshape(vals.shape)
yields
array([[11, 0, 0, 1],
[ 9, 12, 2, 3],
[10, 8, 5, 3],
[ 7, 6, 6, 4]])
The above method works fine as long as the values in asp
, slp
and elv
are small integers. If the integers were too large, the product of their maximums could overflow the maximum allowable value one can pass to np.arange
. Moreover, generating such a large array would be inefficient.
If the values were floats, then they could not be interpreted as indices into the 3D array x
.
So to address these problems, use np.unique
to convert the values in asp
, slp
and elv
to unique integer labels first:
indices = [ np.unique(arr, return_inverse=True)[1].reshape(arr.shape) for arr in [asp, slp, elv] ]
M = np.array([item.max()+1 for item in indices])
x = np.arange(M.prod()).reshape(M)
vals = x[indices]
uniqs, labels = np.unique(vals, return_inverse=True)
labels = labels.reshape(vals.shape)
which yields the same result as shown above, but works even if asp
, slp
, elv
were floats and/or large integers.
Finally, we can avoid the generation of np.arange
:
x = np.arange(M.prod()).reshape(M)
vals = x[indices]
by computing vals
as a product of indices and strides:
M = np.r_[1, M[:-1]]
strides = M.cumprod()
indices = np.stack(indices, axis=-1)
vals = (indices * strides).sum(axis=-1)
So putting it all together:
import numpy as np
asp = np.array([8,1,1,2,7,8,2,3,7,6,4,3,6,5,5,4]).reshape((4,4)) #aspect
slp = np.array([9,10,10,9,9,12,12,9,10,11,11,9,9,9,9,9]).reshape((4,4)) #slope
elv = np.array([13,14,14,13,14,15,16,14,14,15,16,14,13,14,14,13]).reshape((4,4)) #elevation
def find_labels(*arrs):
indices = [np.unique(arr, return_inverse=True)[1] for arr in arrs]
M = np.array([item.max()+1 for item in indices])
M = np.r_[1, M[:-1]]
strides = M.cumprod()
indices = np.stack(indices, axis=-1)
vals = (indices * strides).sum(axis=-1)
uniqs, labels = np.unique(vals, return_inverse=True)
labels = labels.reshape(arrs[0].shape)
return labels
print(find_labels(asp, slp, elv))
# [[ 3 7 7 0]
# [ 6 10 12 4]
# [ 8 9 11 4]
# [ 2 5 5 1]]
This can be done using numpy.unique()
and then a mapping like:
combined = 10000 * asp + 100 * slp + elv
unique = dict(((v, i + 1) for i, v in enumerate(np.unique(combined))))
combined_unique = np.vectorize(unique.get)(combined)
import numpy as np
asp = np.array([8, 1, 1, 2, 7, 8, 2, 3, 7, 6, 4, 3, 6, 5, 5, 4]).reshape((4, 4)) # aspect
slp = np.array([9, 10, 10, 9, 9, 12, 12, 9, 10, 11, 11, 9, 9, 9, 9, 9]).reshape((4, 4)) # slope
elv = np.array([13, 14, 14, 13, 14, 15, 16, 14, 14, 15, 16, 14, 13, 14, 14, 13]).reshape((4, 4))
combined = 10000 * asp + 100 * slp + elv
unique = dict(((v, i + 1) for i, v in enumerate(np.unique(combined))))
combined_unique = np.vectorize(unique.get)(combined)
print(combined_unique)
[[12 1 1 2]
[10 13 3 4]
[11 9 6 4]
[ 8 7 7 5]]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With