I have found the mahalanobis.dist function in the StatMatch package (http://cran.r-project.org/web/packages/StatMatch/StatMatch.pdf), but it isn't doing exactly what I want. It seems to calculate the Mahalanobis distance from each observation in data.y to each observation in data.x.
I would like to calculate the Mahalanobis distance of one observation in data.y to all observations in data.x: basically, the Mahalanobis distance of one point to a "cloud" of points, if that makes sense. I'm trying to get at the probability of an observation being part of another group of observations.
This person (http://people.revoledu.com/kardi/tutorial/Similarity/MahalanobisDistance.html) seems to be doing this, and I've tried to replicate his process in R, but it fails when I get to the bottom part of the equation:
mahaldist = sqrt((inversepooledcov %*% t(meandiffmatrix)) %*% meandiffmatrix)
All the code I am working with is here:
a = rbind(c(2,2), c(2,5), c(6,5),c(7,3))
colnames(a) = c('x', 'y')
b = rbind(c(6,5),c(3,4))
colnames(b) = c('x', 'y')
acov = cov(a)
bcov = cov(b)
meandiff1 = mean(a[,1]) - mean(b[,1])
meandiff2 = mean(a[,2]) - mean(b[,2])
meandiffmatrix = rbind(c(meandiff1,meandiff2))
totaldata = dim(a)[1] + dim(b)[1]
pooledcov = (dim(a)[1]/totaldata * acov) + (dim(b)[1]/totaldata * bcov)
inversepooledcov = solve(pooledcov)
mahaldist = sqrt((inversepooledcov %*% t(meandiffmatrix)) %*% meandiffmatrix)
How about using the mahalanobis function in the stats package:
mahalanobis(x, center, cov, inverted = FALSE, ...)
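As a rough sketch of how that call might be applied to the data in the question: the code below treats a as the "cloud" and measures each row of b against it, using the column means and sample covariance of a as center and cov. Using cov(a) rather than the pooled covariance from the question is an assumption made here for illustration. Note that mahalanobis returns the squared distance, so the square root is taken afterwards. The names d2, d, diff and d_manual are illustrative only.

```r
# Data from the question
a <- rbind(c(2, 2), c(2, 5), c(6, 5), c(7, 3))
colnames(a) <- c("x", "y")
b <- rbind(c(6, 5), c(3, 4))
colnames(b) <- c("x", "y")

# Squared Mahalanobis distance of each row of b to the cloud a
# (assumption: centre and covariance are estimated from a alone)
d2 <- mahalanobis(b, center = colMeans(a), cov = cov(a))

# mahalanobis() returns squared distances, so take the square root
d <- sqrt(d2)

# The same quantity by hand for the first row of b:
# sqrt( diff' %*% S^-1 %*% diff ), which yields a single number
diff <- b[1, ] - colMeans(a)
d_manual <- sqrt(drop(t(diff) %*% solve(cov(a)) %*% diff))
```

The hand-computed version also hints at why the expression in the question misbehaves: with a 1 x 2 row vector of differences, the product has to be ordered as row vector, inverse covariance, column vector (for example meandiffmatrix %*% inversepooledcov %*% t(meandiffmatrix)) to give a 1 x 1 result. The order used in the question produces a 2 x 2 matrix instead.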