Most Efficient way to create a symmetric matrix

Tags:

r

I have the following matrix/ dataframe:

> e
  V1 V2 V3 V4 V5
1  0  2  3  4  5
2  0  0  6  8 10
3  0  0  0 12 15
4  0  0  0  0 20
5  0  0  0  0  0

In this case N=5 (number of rows = number of columns). I would like to fill in the missing values in this symmetric matrix (e[1,2]=e[2,1] etc.). Is there a most efficient way to fill in the missing values (N the matrix size in my case quite big)? Is there a better way than nested loops?

902

asked Sep 24 '15 20:09

Avi

2 Answers

Just for completion I would like to also show this technique. The addition of the transpose will not work if the lower part of the matrix (under the diagonal) has values filled in, as it will add them to the upper part of the matrix.

Using the Matrix package we can create a sparse Matrix, which in case of creating the symmetric of a big matrix would require much less memory and even speed it up.

In order to create a symmetric sparse matrix from matrix e we would do:

library(Matrix)
rowscols <- which(upper.tri(e), arr.ind=TRUE)
sparseMatrix(i=rowscols[,1],    #rows to fill in
             j=rowscols[,2],    #cols to fill in
             x=e[upper.tri(e)], #values to use (i.e. the upper values of e)
             symmetric=TRUE,    #make it symmetric
             dims=c(nrow(e),nrow(e))) #dimensions

Output:

5 x 5 sparse Matrix of class "dsCMatrix"

[1,] .  2  3  4  5
[2,] 2  .  6  8 10
[3,] 3  6  . 12 15
[4,] 4  8 12  . 20
[5,] 5 10 15 20  .

Microbenchmark:

Let's make a function to make a symmetric Matrix out of a matrix (copies the upper part of matrix to the lower by default):

symmetrise <- function(mat){
  rowscols <- which(upper.tri(mat), arr.ind=TRUE)
  sparseMatrix(i=rowscols[,1], 
               j=rowscols[,2], 
               x=mat[upper.tri(mat)], 
               symmetric=TRUE, 
               dims=c(nrow(mat),ncol(mat)) )  
}

And test:

> microbenchmark(
e + t(e),
symmetrise(e),
e[lower.tri(e)] <- t(e)[lower.tri(e)],
times=1000
)
Unit: microseconds
                                  expr      min       lq      mean   median        uq       max neval cld
                              e + t(e)   75.946   99.038  117.1984  110.841  134.9590   246.825  1000 a  
                         symmetrise(e) 5530.212 6246.569 6950.7681 6921.873 7034.2525 48662.989  1000   c
 e[lower.tri(e)] <- t(e)[lower.tri(e)]  261.193  322.771  430.4479  349.968  395.3815 36873.894  1000  b

As you can see symmetrise is actually much slower than e + t(e) or df[lower.tri(df)] <- t(df)[lower.tri(df)] but at least you have a function that automatically symmetrises a matrix (takes the upper part and creates the lower by default) and in case you have a big matrix where memory is an issue, this might come in handy.

P.S. Wherever you see a . in the Matrix this represents a zero. By using a different system a sparse Matrix is a kind of 'compressed' object making it more memory efficient.

111

answered Nov 13 '22 14:11

LyzandeR

Also for speed:

2*symmpart(as.matrix(e))

Here's a benchmark:

Unit: microseconds
                                      expr      min       lq        mean    median        uq       max neval
                                  e + t(e)  572.505  597.194  655.132028  611.5420  628.4860  8424.902  1000
                             symmetrise(e) 1128.220 1154.562 1215.740071 1167.0020 1185.6585 10656.059  1000
 e[lower.tri(e)] <- e[upper.tri(e, FALSE)]  285.013  311.191  350.846885  327.1335  339.5910  8106.006  1000
                2 * symmpart(as.matrix(e))   78.392   93.953  101.330522  102.1860  107.9215   153.628  1000

It can get this speed because it creates a symmetric matrix directly.

answered Nov 13 '22 14:11

Neal Fultz

Related questions
                            
                                Histogram with "negative" logarithmic scale in R
                            
                                In R, Merge two data frames, fill down the blanks
                            
                                split string with regex
                            
                                Why doesn't the plyr package use my parallel backend?
                            
                                Simple method of counting non-NAs in column of data String [duplicate]
                            
                                Subset variables in data frame based on column type
                            
                                R calculate the standard error using bootstrap
                            
                                Passing large matrices to RcppArmadillo function without creating copy (advanced constructors)
                            
                                Efficient method to subset drop rows with NA values in R
                            
                                Count the number of pattern matches in a string
                            
                                How can I extract factor loadings from lavaan?
                            
                                mean( ,na.rm=TRUE) still returns NA
                            
                                Replace text that appears at the end of a string
                            
                                Use string as filter in dplyr?
                            
                                How to build a crossword-like plot for a boolean matrix
                            
                                R: find vector in list of vectors
                            
                                Search for and remove outliers from a dataframe grouped by a variable
                            
                                R-ranking values of a column by grouping, conditional to another variable
                            
                                Basic - T-Test -> Grouping Factor Must have Exactly 2 Levels
                            
                                data.table replacing a value by NA [duplicate]

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Most Efficient way to create a symmetric matrix

Tags:

loops

r

Avi

People also ask

2 Answers

LyzandeR

Neal Fultz

Recent Activity

Donate For Us