
Save a list of dist objects to table in R

I need to create a file containing multiple distance matrices, separated by an empty line. The output should look like this:

# First matrix
0.05194497                                                       
0.04652118 0.12935323                                            
0.04269506 0.09953116 0.08464824                                 
         NA         NA         NA         NA                      
0.02884847 0.07769535 0.05385956 0.04588298         NA           
0.03821721 0.12084543 0.13431270 0.06928795         NA 0.05123967
# Empty line

# Second matrix
0.05194497                                                       
0.04652118 0.12935323                                            
0.04269506 0.09953116 0.08464824                                 
        NA         NA         NA         NA                      
0.02884847 0.07769535 0.05385956 0.04588298         NA           
0.03821721 0.12084543 0.13431270 0.06928795         NA 0.05123967

I have about 100 distance matrices in a list in R, and I need to export them to a text file as shown in the example above. Does anyone have an idea how to do this? I need a single file, not multiple text files.

Curlew asked Mar 20 '13 10:03


1 Answer

Here's one option using sink, lapply, write.table, and dput. write.table cannot write a dist object directly, so in the lapply step we first convert each dist object to a full matrix with as.matrix, then manually set the diagonal and the upper triangle to NA before writing the output.
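To make the conversion step concrete, here is a minimal sketch on a tiny made-up data set. The names pts, d, and m are only for illustration and are not part of the code further down.

```r
# Three made-up points in two dimensions (illustrative data only).
pts <- matrix(c(0, 0,
                3, 4,
                6, 8), ncol = 2, byrow = TRUE)
d <- dist(pts)           # Euclidean distances, stored as a "dist" object

m <- as.matrix(d)        # full symmetric 3 x 3 matrix with a zero diagonal
m[upper.tri(m)] <- NA    # blank the upper triangle
diag(m) <- NA            # blank the diagonal
m                        # only the lower triangle still holds distances
```

After these steps only the three lower-triangle entries are non-missing, which is the shape asked for in the question.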

Here is some sample data:

set.seed(1)
x <- matrix(rnorm(100), nrow = 5)
y <- matrix(rnorm(100), nrow = 5)
myList <- list(A = dist(x),
               B = dist(y))
myList
# $A
#          1        2        3        4
# 2 5.701817                           
# 3 6.013119 5.032069                  
# 4 7.276905 5.325473 5.811861         
# 5 6.619295 5.306750 4.945987 6.612081
# 
# $B
#          1        2        3        4
# 2 7.469565                           
# 3 5.717330 6.407709                  
# 4 5.371346 6.106838 5.057519         
# 5 6.029762 6.256703 4.685266 5.452838

Here's how you can write the output to a file with blank lines between the matrices. A line reading NULL also appears after each matrix: it is dput printing the NULL that write.table returns, and it can be removed by dropping the dput() wrapper.

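sink("myDistList.txt", type = "output")   # redirect console output to the file
invisible(
  lapply(myList, function(x) {
    y <- as.matrix(x)                     # dist -> full symmetric matrix
    y[upper.tri(y)] <- NA                 # blank the upper triangle
    diag(y) <- NA                         # blank the zero diagonal
    # dput() prints write.table()'s NULL return value after each matrix
    dput(write.table(y, row.names = FALSE,
                     col.names = FALSE, na = ""))
    cat("\n\n")                           # blank lines between matrices
  }))
sink()                                    # restore normal console output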

Opening up "myDistList.txt" should give you something that looks like this:

5.70181650842794    
6.01311946994002 5.03206860827638   
7.27690516432265 5.32547302778382 5.8118611864786  
6.61929500038789 5.3067497799772 4.94598733972826 6.61208111472781 
NULL



7.46956498920544    
5.7173301814994 6.40770896281359   
5.37134559156135 6.10683846835378 5.05751911328028  
6.02976206855185 6.25670324709768 4.68526645722475 5.45283785882534 
NULL
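
The NULL lines in the output above come from dput() printing the value returned by write.table(). As a hedged variant (not the code used to produce the output above), the sketch below writes through an open file connection instead of sink(), skips dput(), and drops the all-NA first row and last column so each block starts with the first real distance. The output path under tempdir() and the names outFile, con, and m are assumptions made for illustration.

```r
set.seed(1)
myList <- list(A = dist(matrix(rnorm(100), nrow = 5)),
               B = dist(matrix(rnorm(100), nrow = 5)))

outFile <- file.path(tempdir(), "myDistList.txt")  # hypothetical output path
con <- file(outFile, open = "wt")                  # one connection, one file

for (d in myList) {
  m <- as.matrix(d)
  m[upper.tri(m)] <- NA
  diag(m) <- NA
  m <- m[-1, -ncol(m), drop = FALSE]   # remove the all-NA first row and last column
  write.table(m, file = con, row.names = FALSE,
              col.names = FALSE, na = "")
  cat("\n", file = con)                # one empty line after each matrix
}

close(con)
```

Because the connection stays open for the whole loop, every matrix is appended to the same file, and nothing is printed to the console.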

Of course, capture.output(myList, file = "myDistList.txt") would also get you very close to your desired output, but the file would look exactly as if you had printed myList to the screen (that is, it will include the list element names and the row and column labels). Some regex work should be able to remove the extra lines easily, though, if you decide to go that route.

For example, using "geany" as my text editor for the output of capture.output, I was able to clean up the text file with the following search-and-replace options (with "Use regular expressions" selected, of course):

  1. Search for ^\s+.*|^\$.* and replace with a single space
  2. Search for ^[0-9]+\s(.*) and replace with \1
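
The same cleanup can probably be done without leaving R. The following is a rough sketch under the assumption that the printed list has the layout shown earlier (a "$name" line, a column-header line, then labelled rows); the patterns are adapted from the two search-and-replace steps rather than copied, and the output file name is hypothetical. Note that printed values are rounded to the default number of significant digits, so this route keeps less precision than write.table.

```r
set.seed(1)
myList <- list(A = dist(matrix(rnorm(100), nrow = 5)),
               B = dist(matrix(rnorm(100), nrow = 5)))

txt <- capture.output(myList)        # same text as printing myList
txt <- txt[!grepl("^\\$", txt)]      # drop the "$A", "$B" name lines
txt <- txt[!grepl("^ +[0-9]", txt)]  # drop the column-header lines
txt <- sub("^[0-9]+ +", "", txt)     # strip the leading row label

# Hypothetical output location, chosen only for this example.
writeLines(txt, file.path(tempdir(), "myDistList_clean.txt"))
```

The empty line that print() emits after each list element is left in place, so the matrices stay separated by a blank line.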

In a way, I prefer this method to having to go back through and convert the distance matrix to a matrix and so on.

A5C1D2H2I1M1N2O1R2T1 answered Sep 25 '22 05:09