Given the data frame,
df <- read.table(text="
 X1 X2 X3 X4 X5
  1  2  3  2  1
  2  3  4  4  3
  3  4  4  6  2
  4  5  5  5  4
  2  3  3  3  6
  5  6  2  8  4", header=TRUE)
I want to create a distance matrix containing the mean absolute difference between each pair of columns, computed row by row. For example, the distance between X1 and X3 should be 1.67, given that:
(abs(1-3) + abs(2-4) + abs(3-4) + abs(4-5) + abs(2-3) + abs(5-2)) / 6 = 10 / 6 = 1.67
I have tried using the designdist() function in the vegan package this way:
designdist(t(df), method = "abs(A-B)/6", terms = "minimum")
The resulting distance for columns 1 and 3 is 0.666. The problem with this function is that it sums all the values in each column and then subtracts one total from the other. What I need instead is to take the absolute difference row by row, sum those absolute differences, and then divide by N.
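The gap between the two numbers can be reproduced in base R without vegan. This is only a sketch of the arithmetic, and it assumes, as described above, that the designdist() call reduces to a difference of column totals for this data:

```r
# Columns X1 and X3 from the data above
X1 <- c(1, 2, 3, 4, 2, 5)
X3 <- c(3, 4, 4, 5, 3, 2)

# Difference of the column totals: |17 - 21| / 6
abs(sum(X1) - sum(X3)) / length(X1)   # about 0.667

# Row-by-row absolute differences, summed, then divided by N: 10 / 6
sum(abs(X1 - X3)) / length(X1)        # about 1.667
```

The second expression is the same as mean(abs(X1 - X3)), which is the quantity wanted for every pair of columns.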
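Here's a one-line solution. It uses dist()'s method argument to calculate the L1 norm (also known as city block or Manhattan distance). Because dist() measures distances between rows, transpose the data.frame with t() first so that each pair of columns is compared, then divide by the number of rows to turn the sum into a mean.
as.matrix(dist(t(df), "manhattan", diag=TRUE, upper=TRUE)/nrow(df))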
To make it reproducible:
df <- read.table(text="
 X1 X2 X3 X4 X5
  1  2  3  2  1
  2  3  4  4  3
  3  4  4  6  2
  4  5  5  5  4
  2  3  3  3  6
  5  6  2  8  4", header=TRUE)
# dist() compares rows, so transpose to get distances between columns
dmat <- as.matrix(dist(t(df), "manhattan", diag=TRUE, upper=TRUE)/nrow(df))
print(dmat, digits=3)
#      X1    X2   X3    X4   X5
# X1 0.00 1.000 1.67 1.833 1.17
# X2 1.00 0.000 1.00 0.833 1.50
# X3 1.67 1.000 0.00 1.500 1.83
# X4 1.83 0.833 1.50 0.000 2.33
# X5 1.17 1.500 1.83 2.333 0.00
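As a cross-check on the column-wise result, here is a sketch that computes the mean absolute difference for every pair of columns explicitly and compares it with dist() applied to the transposed data. The helper name mean_abs_diff and the variable names check and dmat_cols are made up for this illustration and are not part of any package.

```r
df <- read.table(text="
 X1 X2 X3 X4 X5
  1  2  3  2  1
  2  3  4  4  3
  3  4  4  6  2
  4  5  5  5  4
  2  3  3  3  6
  5  6  2  8  4", header=TRUE)

# Hypothetical helper: mean of the row-by-row absolute differences
mean_abs_diff <- function(a, b) mean(abs(a - b))

# Apply the helper to every pair of columns (5 x 5 matrix)
cols <- names(df)
check <- sapply(cols, function(j)
  sapply(cols, function(i) mean_abs_diff(df[[i]], df[[j]])))

# dist() works on rows, so transpose to compare columns
dmat_cols <- as.matrix(dist(t(df), method = "manhattan") / nrow(df))

# Compare values only; TRUE means the two approaches agree
all.equal(check, dmat_cols, check.attributes = FALSE)

# The X1 vs X3 entry from the question (10 / 6)
round(check["X1", "X3"], 2)
```

The explicit version is slower on wide data but makes the definition visible, which helps when confirming that the distance is taken between columns rather than rows.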