Let's say I have two NumPy arrays, arr1 and arr2:
arr1 = np.random.randint(3, size = 100)
arr2 = np.random.randint(3, size = 100)
I would like to build a matrix that contains the number of joint occurrences.
In other words, for every position where arr1 is 0, count how many of the elements of arr2 at the same position are also 0, and likewise for every other pair of values. I would like to get the following matrix:
M = [[p(0,0), p(0,1), p(0,2)],
     [p(1,0), p(1,1), p(1,2)],
     [p(2,0), p(2,1), p(2,2)]]
Where p(0,0) stands for the number of positions that are 0 in arr1 and 0 in arr2.
First Attempt:
As a first attempt I have tried the following:
[[sum(arr1[arr2 == y] == x) for x in np.arange(0,3)] for y in np.arange(0,3)]
But Python raises the following error:
NameError: name 'arr1' is not defined
Second Attempt:
I tried to dig into this error by making use of for-loops:
M = np.array([])
for x in np.arange(0,dim):
    result = np.array([])
    for y in np.arange(0,dim):
        result_temp = sum(arr1[arr2 == x] == y)
        result = np.append(result, result_temp)
    M = np.append(M,result)
In this case Python does not raise the previous error, but instead of a 3x3 array I get a 1x9 array, and I am not able to get the desired 3x3 array.
Thanks in advance.
Your first list comprehension works. You won't get a NameError if arr1 is defined:
import numpy as np
np.random.seed(2016)
arr1 = np.random.randint(3, size = 100)
arr2 = np.random.randint(3, size = 100)
result = [[sum(arr1[arr2 == y] == x) for x in np.arange(0,3)]
          for y in np.arange(0,3)]
print(result)
# [[10, 9, 10], [8, 13, 15], [18, 8, 9]]
But you could instead use np.histogram2d:
result2, xedges, yedges = np.histogram2d(arr2, arr1, bins=range(4))
print(result2)
yields
[[ 10. 9. 10.]
[ 8. 13. 15.]
[ 18. 8. 9.]]
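As for the second attempt: np.append flattens its inputs when no axis is given, which is why the loop ends up with 9 values in a row instead of a 3x3 array. Below is a minimal sketch of one way around that, assuming dim = 3 (the question does not show how dim is defined) and assuming you want rows to follow arr1 and columns to follow arr2, as in the matrix M from the question. Note that the two results above are laid out the other way round (rows follow arr2), so they are the transpose of this one. The names M_loop and M_fast are made up for the example.

```python
import numpy as np

np.random.seed(2016)
arr1 = np.random.randint(3, size=100)
arr2 = np.random.randint(3, size=100)
dim = 3  # assumed: number of distinct values (0, 1, 2)

# Loop version: fill a preallocated dim x dim array instead of calling
# np.append, which flattens its inputs when no axis is given.
M_loop = np.zeros((dim, dim), dtype=int)
for x in range(dim):
    for y in range(dim):
        # number of positions where arr1 is x and arr2 is y
        M_loop[x, y] = np.sum((arr1 == x) & (arr2 == y))

# Vectorized alternative: add 1 at each (arr1[k], arr2[k]) index pair.
# np.add.at accumulates correctly even when index pairs repeat.
M_fast = np.zeros((dim, dim), dtype=int)
np.add.at(M_fast, (arr1, arr2), 1)

print(M_loop)
print(np.array_equal(M_loop, M_fast))
```

Both arrays hold integer counts, unlike the float array returned by np.histogram2d. If you would rather keep the original loop, calling M.reshape(dim, dim) on the 9-element result should also give a 3x3 array.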