
Find indices of 5 closest samples in distance matrix

Tags: r, matrix, distance

I have a distance matrix dMat and want to find the 5 nearest samples to the first one. What function can I use in R? I know how to find the closest sample (cf. 3rd line of code), but can't figure out how to get the other 4 samples.

The code:

Mat <- replicate(10, rnorm(10))
dMat <- as.matrix(dist(Mat))
which(dMat[,1]==min(dMat[,1]))

The 3rd line of code finds the index of the closest sample to the first sample.

Thanks for any help!

Best, Chega

Chega, asked Jan 16 '13 10:01


2 Answers

You can use order to do this:

head(order(dMat[-1,1]),5)+1
[1] 10  3  4  8  6

Note that I removed the first element, as you presumably don't want to include the fact that your reference point is at distance 0 from itself. The +1 shifts the positions back so that they index the rows of the original matrix.
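The same idea can be wrapped in a small helper that works for any reference sample, not only the first. This is only a sketch: `nearest_k` and its arguments are hypothetical names, and the seed is an arbitrary choice made for reproducibility. Instead of dropping a row and adding 1, it sets the reference sample's own distance to `Inf` so that it sorts last.

```r
# Hypothetical helper: indices of the k samples closest to sample i,
# given a full symmetric distance matrix such as as.matrix(dist(x)).
nearest_k <- function(dMat, i = 1, k = 5) {
  d <- dMat[, i]   # distances from every sample to sample i
  d[i] <- Inf      # push the reference sample itself to the end
  head(order(d), k)
}

set.seed(1)        # arbitrary seed, assumed here for reproducibility
Mat  <- replicate(10, rnorm(10))
dMat <- as.matrix(dist(Mat))

nearest_k(dMat)         # same indices as head(order(dMat[-1, 1]), 5) + 1
nearest_k(dMat, i = 4)  # the 5 samples nearest to sample 4
```

Masking with `Inf` avoids the index shift, which is easy to get wrong when the reference sample is not in the first row.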

James, answered Oct 19 '22 20:10


Alternative using sort (the first of the six indices returned is the reference sample itself, at distance 0):

sort(dMat[,1], index.return = TRUE)$ix[1:6]

It would help to call set.seed(.) before generating the random matrix, so that the results can be reproduced and the two approaches shown to agree. I will skip the output here.
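A seeded run makes that comparison concrete. The sketch below assumes an arbitrary seed and uses the hypothetical names `via_order` and `via_sort`; it relies on sample 1 being the unique minimum of its own column, which holds for random data with no duplicate rows.

```r
set.seed(1)  # arbitrary seed, assumed here so the run is reproducible
Mat  <- replicate(10, rnorm(10))
dMat <- as.matrix(dist(Mat))

# order-based approach: drop sample 1, then shift the positions back
via_order <- head(order(dMat[-1, 1]), 5) + 1L

# sort-based approach: keep sample 1, then skip it (it sorts first at distance 0)
via_sort <- sort(dMat[, 1], index.return = TRUE)$ix[2:6]

all(via_order == via_sort)  # TRUE when both approaches agree
```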

Edit (correct solution): The sort solution above only works if the first element is the smallest value in its column. That holds for a true distance matrix, whose diagonal is 0, but not for an arbitrary column of values. This version always gives the 5 values closest to the first element of the column:

> sort(abs(dMat[-1,1] - dMat[1,1]), index.return=TRUE)$ix[1:5] + 1

Example:

> dMat <- matrix(c(70,4,2,1,6,80,90,100,3), ncol=1)
# James' solution
> head(order(dMat[-1,1]),5) + 1
[1] 4 3 9 2 5 # values are 1,2,3,4,6 (wrong)
# old sort solution
> sort(dMat[,1], index.return = TRUE)$ix[1:6]
[1] 4 3 9 2 5 1 #  values are 1,2,3,4,6,70 (wrong)
# Correct solution
> sort(abs(dMat[-1,1] - dMat[1,1]), index.return=TRUE)$ix[1:5] + 1
[1] 6 7 8 5 2 # values are 80,90,100,6,4 (right)
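For a matrix produced by `dist`, the diagonal entry `dMat[1,1]` is 0, so subtracting it changes nothing and the two expressions select the same samples. The following sketch checks this on seeded random data; the seed and the names `by_order` and `by_abs` are assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, assumed here so the run is reproducible
Mat  <- replicate(10, rnorm(10))
dMat <- as.matrix(dist(Mat))

dMat[1, 1]   # 0: a sample is at distance 0 from itself

# order on the distances, with sample 1 dropped
by_order <- head(order(dMat[-1, 1]), 5) + 1L

# absolute difference from the first element of the column
by_abs <- sort(abs(dMat[-1, 1] - dMat[1, 1]), index.return = TRUE)$ix[1:5] + 1L

all(by_order == by_abs)  # TRUE on a true distance matrix
```

The two only diverge when the column is not a column of distances from the first sample, as in the 70, 4, 2, ... example.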

Arun, answered Oct 19 '22 20:10