
Mean absolute difference of two numpy arrays

I have two one-dimensional NumPy arrays X and Y. I need the mean absolute difference over all pairs of elements, one taken from X and one from Y. The naive way is a nested for loop:

import numpy as np
np.random.seed(1)
X = np.random.randint(10, size=10)
Y = np.random.randint(10, size=10)

s = 0
for x in X:
    for y in Y:
        s += abs(x - y)
mean = s / (X.size * Y.size)
#3.4399999999999999

Question: Does NumPy provide a vectorized, faster version of this solution?

Edited: I need the mean absolute difference (always non-negative). Sorry for the confusion.

DYZ asked Jan 28 '23 01:01


2 Answers

If I understand your definition correctly, you can just use broadcasting:

np.mean(np.abs(X[:, None] - Y))
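As a rough illustration of why the one-liner matches the nested loop, here is a minimal sketch. The helper name `mean_abs_diff` and the choice of a different length for Y are assumptions made for this example and are not part of the original answer.

```python
import numpy as np

def mean_abs_diff(X, Y):
    """Mean of |x - y| over all pairs (x, y), x from X and y from Y."""
    X = np.asarray(X)
    Y = np.asarray(Y)
    # X[:, None] has shape (X.size, 1). Broadcasting it against Y, which has
    # shape (Y.size,), yields the (X.size, Y.size) matrix of pairwise differences.
    return np.abs(X[:, None] - Y).mean()

np.random.seed(1)
X = np.random.randint(10, size=10)
Y = np.random.randint(10, size=7)  # different length on purpose

# Reference value computed the same way as the nested loop in the question
loop_mean = sum(abs(int(x) - int(y)) for x in X for y in Y) / (X.size * Y.size)

print(mean_abs_diff(X, Y), loop_mean)
```

Note that the intermediate matrix holds X.size * Y.size elements, so for very large inputs memory use may become the limiting factor.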
miradulo answered Jan 30 '23 15:01


If you tile the two arrays along opposite axes, you can take the absolute value of their element-wise difference:

Code:

x = np.tile(X, (X.size, 1))
y = np.transpose(np.tile(Y, (Y.size, 1)))

mean_diff = np.sum(np.abs(x-y)) / (X.size * Y.size)

Test Code:

import numpy as np
X = np.random.randint(10, size=10)
Y = np.random.randint(10, size=10)

s = 0
for x in X:
    for y in Y:
        s += abs(x - y)
mean = s / (X.size * Y.size)
print(mean)

x = np.tile(X, (X.size, 1))
y = np.transpose(np.tile(Y, (Y.size, 1)))

print(np.sum(np.abs(x-y)) / (X.size * Y.size))

Results:

3.48
3.48
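
One caveat: as written, the tiled version builds an (X.size, X.size) array and a (Y.size, Y.size) array, so it relies on X and Y having the same length, as they do in the question. A possible generalization to unequal lengths is sketched below. The function name `mean_abs_diff_tiled` and the test arrays are assumptions made for illustration, and the result is checked against the broadcasting expression from the other answer.

```python
import numpy as np

def mean_abs_diff_tiled(X, Y):
    # Repeat X as rows, one row per element of Y -> shape (Y.size, X.size)
    x = np.tile(X, (Y.size, 1))
    # Repeat Y as rows, one row per element of X, then transpose
    # -> also shape (Y.size, X.size), with Y varying down the columns
    y = np.transpose(np.tile(Y, (X.size, 1)))
    return np.sum(np.abs(x - y)) / (X.size * Y.size)

rng = np.random.RandomState(0)
X = rng.randint(10, size=6)
Y = rng.randint(10, size=4)  # deliberately a different length than X

tiled = mean_abs_diff_tiled(X, Y)
broadcast = np.mean(np.abs(X[:, None] - Y))
print(tiled, broadcast)
```

Both approaches materialize the full set of pairwise differences. The tiled one additionally allocates the two repeated arrays, so it will typically use somewhat more memory than broadcasting.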
Stephen Rauch answered Jan 30 '23 16:01