I got a distance matrix with the following steps:
x <- read.table(textConnection('
t0 t1 t2
aaa 0 1 0
bbb 1 0 1
ccc 1 1 1
ddd 1 1 0
' ), header=TRUE)
As such x
is a data frame with column and row headers
t0 t1 t2
aaa 0 1 0
bbb 1 0 1
ccc 1 1 1
ddd 1 1 0
require(vegan)
d <- vegdist(x, method="jaccard")
The distance matrix d is obtained as follows:
aaa bbb ccc
bbb 1.0000000
ccc 0.6666667 0.3333333
ddd 0.5000000 0.6666667 0.3333333
By typing str(d), I found it is not a ordinary table nor csv format.
Class 'dist' atomic [1:6] 1 0.667 0.5 0.333 0.667 ...
..- attr(*, "Size")= int 4
..- attr(*, "Labels")= chr [1:4] "aaa" "bbb" "ccc" "ddd"
..- attr(*, "Diag")= logi FALSE
..- attr(*, "Upper")= logi FALSE
..- attr(*, "method")= chr "jaccard"
..- attr(*, "call")= language vegdist(x = a, method = "jaccard")
I want to covert the distance matrix to a 3 columns with new headers and save it as a csv file as follows:
c1 c2 distance
aaa bbb 1.000
aaa ccc 0.6666667
aaa ddd 0.5
bbb ccc 0.3333333
bbb ddd 0.6666667
ccc ddd 0.3333333
Convert a Data Frame into a Numeric Matrix in R Programming – data. matrix() Function. data. matrix() function in R Language is used to create a matrix by converting all the values of a Data Frame into numeric mode and then binding them as a matrix.
The dissimilarity matrix (also called distance matrix) describes pairwise distinction between M objects. It is a square symmetrical MxM matrix with the (ij)th element equal to the value of a chosen measure of distinction between the (i)th and the (j)th object.
The distance matrix is widely used in the bioinformatics field, and it is present in several methods, algorithms and programs. Distance matrices are used to represent protein structures in a coordinate-independent manner, as well as the pairwise distances between two sequences in sequence space.
This is quite doable using base R functions. First we want all pairwise combinations of the rows to fill the columns c1
and c2
in the resulting object. The final column distance
is achieved by simply converting the "dist"
object d
into a numeric vector (it already is a vector but of a different class).
The first step is done using combn(rownames(x), 2)
and the second step via as.numeric(d)
:
m <- data.frame(t(combn(rownames(x),2)), as.numeric(d))
names(m) <- c("c1", "c2", "distance")
Which gives:
> m
c1 c2 distance
1 aaa bbb 1.0000000
2 aaa ccc 0.6666667
3 aaa ddd 0.5000000
4 bbb ccc 0.3333333
5 bbb ddd 0.6666667
6 ccc ddd 0.3333333
To save as a CSV file, write.csv(m, file = "filename.csv")
.
you can do this by combining melt from reshape package, upper.tri etc.:
> library(reshape)
> m <- as.matrix(d)
> m
aaa bbb ccc ddd
aaa 0.0000000 1.0000000 0.6666667 0.5000000
bbb 1.0000000 0.0000000 0.3333333 0.6666667
ccc 0.6666667 0.3333333 0.0000000 0.3333333
ddd 0.5000000 0.6666667 0.3333333 0.0000000
> m2 <- melt(m)[melt(upper.tri(m))$value,]
> names(m2) <- c("c1", "c2", "distance")
> m2
c1 c2 distance
5 aaa bbb 1.0000000
9 aaa ccc 0.6666667
10 bbb ccc 0.3333333
13 aaa ddd 0.5000000
14 bbb ddd 0.6666667
15 ccc ddd 0.3333333
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With