
Force a "dist" diagonal to 1

Tags: r

Is it possible to force the diagonal of a distance object from the stats package to be something other than 0?

Here I try forcing it to something else; I can alter the upper or lower triangle, but not the diagonal:

set.seed(10)
x <- matrix(rnorm(25), ncol = 5)
y <- dist(x, diag = TRUE)
z <- 1 - as.matrix(y)
as.dist(z, diag = TRUE)

gives:

           1          2          3          4          5
1  0.0000000                                            
2 -0.9030066  0.0000000                                 
3 -0.9803571 -1.9319785  0.0000000                      
4 -1.5249747 -2.3673155 -1.5928891  0.0000000           
5 -2.7903980 -2.8020380 -2.2491893 -1.5839067  0.0000000

rather than the expected:

           1          2          3          4          5
1  1.0000000                                            
2 -0.9030066  1.0000000                                 
3 -0.9803571 -1.9319785  1.0000000                      
4 -1.5249747 -2.3673155 -1.5928891  1.0000000           
5 -2.7903980 -2.8020380 -2.2491893 -1.5839067  1.0000000

Maybe I have to output it as a matrix object instead, because a nonzero diagonal doesn't conform to the way "dist" objects are handled.

Tyler Rinker asked Jan 14 '23 09:01


1 Answer

Not with dist(); it doesn't store the diagonal, just a flag indicating whether the diagonal should be printed by the print() method.

This is not unexpected; dist() is a compact way of storing distance matrices, not symmetric matrices in general. In a distance matrix, by definition, the distance between an observation and itself is 0. Hence dist() treats the diagonal as the trivial thing it is and doesn't store it.
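A quick way to see this for yourself, reusing the objects from the question: the "dist" object only holds the n * (n - 1) / 2 lower-triangle values, and as.dist() copies just that triangle out of a matrix, so whatever was on the diagonal of z is dropped. This is only an illustrative check, not anything the stats package documents in this exact form.

```r
set.seed(10)
x <- matrix(rnorm(25), ncol = 5)
y <- dist(x, diag = TRUE)

# Only the lower triangle is stored: n * (n - 1) / 2 values
length(y)              # 10 for n = 5

# The rest is metadata (Size, Diag, Upper, ...); no diagonal values
names(attributes(y))

# z has 1 on its diagonal, but as.dist() keeps only the lower triangle
z <- 1 - as.matrix(y)
d <- as.dist(z, diag = TRUE)
length(d)              # still 10

# Converting back refills the diagonal with zeros
diag(as.matrix(d))
```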

If I wanted to do what you want, I would borrow the guts of dist() and store the data the way dist() does, in a function, say mydist(), returning an object of class "mydist". I would then write print.mydist(), taking code from the print.dist() method but using another value for the diagonal, and as.matrix.mydist() to do the conversion to a matrix. Your class could either store the values for the diagonal (if they vary) or just a single value that you want the diagonal to be.

Essentially, then, all you'd need to do is store the diagonal value(s) you want as an extra attribute, then provide print() and as.matrix() methods that draw on that attribute to print or populate the matrix.
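A minimal sketch of that idea follows. Everything here is hypothetical: the constructor name as_mydist(), the attribute name diag_values, and the simplified print method (which does not reuse the print.dist() code) are assumptions made for illustration. It stores the lower triangle column by column, as dist() does, and keeps the diagonal in an extra attribute.

```r
# Hypothetical constructor: keep the lower triangle of a square symmetric
# matrix plus its diagonal as an extra attribute.
as_mydist <- function(m) {
  m <- as.matrix(m)
  stopifnot(nrow(m) == ncol(m))
  structure(
    m[lower.tri(m)],
    Size = nrow(m),
    Labels = rownames(m),
    diag_values = unname(diag(m)),  # assumed attribute name
    class = "mydist"
  )
}

# Rebuild the full symmetric matrix, filling the diagonal from the attribute
as.matrix.mydist <- function(x, ...) {
  n <- attr(x, "Size")
  m <- matrix(0, n, n)
  m[lower.tri(m)] <- as.vector(unclass(x))
  m <- m + t(m)
  diag(m) <- attr(x, "diag_values")
  labs <- attr(x, "Labels")
  dimnames(m) <- list(labs, labs)
  m
}

# Print the lower triangle and the stored diagonal, blanking the upper part
print.mydist <- function(x, digits = getOption("digits"), ...) {
  out <- format(as.matrix(x), digits = digits)
  out[upper.tri(out)] <- ""
  print(out, quote = FALSE, right = TRUE)
  invisible(x)
}

# Usage with the data from the question
set.seed(10)
x <- matrix(rnorm(25), ncol = 5)
z <- 1 - as.matrix(dist(x))
md <- as_mydist(z)
md
```

Storing the whole diagonal covers the case where the values vary; if a single value is enough, the attribute could hold one number and diag(m) <- would recycle it.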

Gavin Simpson answered Jan 22 '23 06:01