Create a co-occurrence matrix from dummy-coded observations

Tags:

r

Is there a simple way to convert a data frame of binary-coded dummies (each indicating whether an aspect is present) into a co-occurrence matrix containing the counts of two aspects co-occurring?

E.g. going from this

X <- data.frame(rbind(c(1,0,1,0), c(0,1,1,0), c(0,1,1,1), c(0,0,1,0)))
X
  X1 X2 X3 X4
1  1  0  1  0
2  0  1  1  0
3  0  1  1  1
4  0  0  1  0

to this

   X1 X2 X3 X4
X1  0  0  1  0
X2  0  0  2  1
X3  1  2  0  1
X4  0  1  1  0

asked May 16 '12 16:05

mhermans

1 Answer

This will do the trick:

X <- as.matrix(X)
out <- crossprod(X)  # Same as: t(X) %*% X
diag(out) <- 0       # (b/c you don't count co-occurrences of an aspect with itself)
out
#      [,1] [,2] [,3] [,4]
# [1,]    0    0    1    0
# [2,]    0    0    2    1
# [3,]    1    2    0    1
# [4,]    0    1    1    0
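
Why this works: entry (i, j) of t(X) %*% X is the sum over rows of X[, i] * X[, j], and with 0/1 data that product is 1 only when both aspects are present in the same row. Before it is zeroed, the diagonal holds how often each aspect occurs at all. A minimal sketch that checks the matrix product against an explicit count; the helper name `count_pairs` is made up for illustration and is not part of the answer:

```r
# Example data from the question
X <- as.matrix(data.frame(rbind(c(1,0,1,0), c(0,1,1,0), c(0,1,1,1), c(0,0,1,0))))

# Hypothetical helper: for each pair of distinct columns,
# count the rows in which both columns are 1
count_pairs <- function(m) {
  p <- ncol(m)
  res <- matrix(0, p, p)
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      if (i != j) res[i, j] <- sum(m[, i] == 1 & m[, j] == 1)
    }
  }
  res
}

by_loop <- count_pairs(X)

by_crossprod <- crossprod(X)
diag(by_crossprod) <- 0

# Compare the values only, ignoring any dimnames
agree <- all(by_loop == unname(by_crossprod))
```

If the two approaches match, `agree` is TRUE. The loop is only there to make the counting explicit; for real data the crossprod() version is the one to use.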

To get the results into a data.frame exactly like the one you showed, you can then do something like:

nms <- paste("X", 1:4, sep="")
dimnames(out) <- list(nms, nms)
out <- as.data.frame(out)
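
If the data frame does not always have four columns named X1 to X4, the same steps can be wrapped in a small function that takes the labels from the input instead of hard-coding them. This is only a sketch: `cooccurrence` is a hypothetical name, not a base R function, and it assumes every column is coded 0/1.

```r
# Hypothetical wrapper around the steps above; assumes all columns are 0/1
cooccurrence <- function(df) {
  m <- as.matrix(df)
  out <- crossprod(m)   # same as t(m) %*% m
  diag(out) <- 0        # an aspect does not co-occur with itself
  # Label rows and columns with the input's column names
  dimnames(out) <- list(names(df), names(df))
  as.data.frame(out)
}

X <- data.frame(rbind(c(1,0,1,0), c(0,1,1,0), c(0,1,1,1), c(0,0,1,0)))
res <- cooccurrence(X)
```

Here `res["X2", "X3"]` gives the number of rows in which both X2 and X3 are present.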

answered Oct 09 '22 12:10

Josh O'Brien
