I want to make a NumPy array that counts how many times each value (from 1 to 3) occurs at each position. For example, if I have:
a = np.array([[1,2,3],
[3,2,1],
[2,1,3],
[1,1,1]])
I want to get back an array like so:
[[[ 1 0 0]
[ 0 1 0]
[ 0 0 1]]
[[ 0 0 1]
[ 0 1 0]
[ 1 0 0]]
[[ 0 1 0]
[ 1 0 0]
[ 0 0 1]]
[[ 1 0 0]
[ 1 0 0]
[ 1 0 0]]]
The array tells me that 1 occurs once in the first position, 2 occurs once in the second position, 3 occurs once in the third position, 3 occurs once in the fourth position, etc. Later, I'll have more input arrays of the same dimensions, and I would like to add their value counts to this array of counts.
The code I have right now is:
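import numpy as np

a = np.array([[1,2,3],
              [3,2,1],
              [2,1,3],
              [1,1,1]])
cumulative = np.zeros((4,3,3))
# Visit every (row, column) position and bump the slot for the value found there
for r in range(len(cumulative)):
    for c in range(len(cumulative[0])):
        cumulative[r, c, a[r,c]-1] +=1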
This does give me the output I want. However, I would like to condense the for loops into one line, using a line similar to this:
cumulative[:, :, a[:, :]-1] +=1
This line doesn't work, and I can't find anything online on how to perform this operation. Any suggestions?
IIUC, you could take advantage of broadcasting:
In [93]: ((a[:, None] - 1) == np.arange(3)[:, None]).swapaxes(2, 1).astype(int)
Out[93]:
array([[[1, 0, 0],
[0, 1, 0],
[0, 0, 1]],
[[0, 0, 1],
[0, 1, 0],
[1, 0, 0]],
[[0, 1, 0],
[1, 0, 0],
[0, 0, 1]],
[[1, 0, 0],
[1, 0, 0],
[1, 0, 0]]])
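To make the broadcasting step easier to follow, here is a minimal sketch of the same idea written with a trailing axis instead of swapaxes, plus one way to keep running totals over several inputs. It assumes the values are exactly 1 to 3, as in the question; the names n_values, one_hot and the (a, a) stand-in list are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [3, 2, 1],
              [2, 1, 3],
              [1, 1, 1]])

n_values = 3  # assumption: values are 1..3, as in the question

# a[..., None] has shape (4, 3, 1) and np.arange(1, n_values + 1) has
# shape (3,), so the comparison broadcasts to shape (4, 3, 3):
# one_hot[r, c, v] is 1 exactly when a[r, c] == v + 1.
one_hot = (a[..., None] == np.arange(1, n_values + 1)).astype(int)

# Running totals: add the one-hot encoding of each new input array.
cumulative = np.zeros(a.shape + (n_values,), dtype=int)
for arr in (a, a):  # stand-in for several input arrays of the same shape
    cumulative += (arr[..., None] == np.arange(1, n_values + 1))
```

Since each input contributes at most 1 per slot, cumulative[r, c, v] ends up holding how many of the inputs had the value v + 1 at position (r, c).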
It's technically not a one-liner, but if you ignore PEP 8's maximum line length then you can whittle it down to two lines.
import numpy as np

a = np.array([[1,2,3],
              [3,2,1],
              [2,1,3],
              [1,1,1]])
# Last axis holds the one-hot vector, so out[r, c, a[r, c] - a.min()] is set to 1
out = np.zeros((a.shape[0], a.shape[1], 1 + a.max() - a.min()), dtype=np.int8)
out[np.repeat(np.arange(a.shape[0]), a.shape[1]), np.tile(
    np.arange(a.shape[1]), a.shape[0]), np.subtract(a, a.min()).flatten()] = 1
print(out)
Which outputs:
[[[1 0 0]
[0 1 0]
[0 0 1]]
[[0 0 1]
[0 1 0]
[1 0 0]]
[[0 1 0]
[1 0 0]
[0 0 1]]
[[1 0 0]
[1 0 0]
[1 0 0]]]
This is perhaps neither the most elegant nor the most graceful solution, and unfortunately it does not generalise to n dimensions, but hopefully this (almost) one-liner is sufficiently vectorised for you.
It's quite hefty so I'll briefly run through how this works.
The output array is created full of zeros, with the length of each 'one-hot vector' equal to the range of values in the input array, that is 1 + a.max() - a.min() (I assumed this is what you wanted, given that your example has no slot for the value zero).
np.tile and np.repeat are used with np.arange to produce the row and column index arrays, that is, the indices of each element in a. The flattened values of a, shifted by a.min(), supply the remaining index.
Fancy indexing is then used to set the entry matching each value to 1.
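As a hedged alternative that stays close to the one-line update the question was reaching for, the row and column index arrays can also be built with np.indices. This is a sketch that assumes the values are 1 to 3 and the count array has shape (4, 3, 3), as in the question; it is not the method used above, just the same fancy-indexing idea with a different helper.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [3, 2, 1],
              [2, 1, 3],
              [1, 1, 1]])

cumulative = np.zeros((4, 3, 3), dtype=int)

# np.indices returns one index array per axis, each shaped like a:
# rows[r, c] == r and cols[r, c] == c.
rows, cols = np.indices(a.shape)

# The three index arrays all have shape (4, 3), so NumPy pairs them
# element-wise and updates cumulative[r, c, a[r, c] - 1] for every (r, c).
cumulative[rows, cols, a - 1] += 1
```

Running the last line again with another input array of the same shape adds its counts to the totals. The in-place += is safe here because every (r, c) pair appears only once per update; if an index could repeat within a single update, np.add.at would be the function to reach for instead.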