I'm relatively new to R, so forgive me for what I believe to be a relatively simple question.
I have data in the form:
  1 2 3 4 5
A 0 1 1 0 0
B 1 0 1 0 1
C 0 1 0 1 0
D 1 0 0 0 0
E 0 0 0 0 1
where A-E are people and columns 1-5 are binary indicators of whether each person has that quality. I need to build an A-E by A-E matrix in which cell (A, B) = 1 if, for any quality 1-5, the values for A and B sum to 2 (that is, if they share at least one quality). The simple 5x5 would be:
  A B C D E
A 1
B 1 1
C 1 0 1
D 0 1 0 1
E 0 1 0 0 1
I then need to sum the entire matrix (the one above would give 9). I have thousands of observations, so I can't do this by hand. I'm sure a few lines of code would do it; I'm just not experienced enough.
Thanks!
EDIT: I've imported the data from a .csv file with the columns (1-5 above) as variables; the real data has 40 variables. A-E are unique IDs for people, of which there are approximately 2000. I would also like to know how to first convert this into a matrix, so that I can run the great answers you have already provided. Thanks!
You can use matrix multiplication here:
out <- tcrossprod(m)
# A B C D E
# A 2 1 1 0 0
# B 1 3 0 1 1
# C 1 0 2 0 0
# D 0 1 0 1 0
# E 0 1 0 0 1
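The call above assumes `m` is a numeric 0/1 matrix with people as rows and qualities as columns. As a minimal sketch, the example data from the question can be rebuilt like this so the output is reproducible (the name `m` follows the answer; the way it is constructed here is my assumption about the layout):

```r
# Example data from the question: rows are people A-E, columns are qualities 1-5.
m <- matrix(
  c(0, 1, 1, 0, 0,
    1, 0, 1, 0, 1,
    0, 1, 0, 1, 0,
    1, 0, 0, 0, 0,
    0, 0, 0, 0, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(LETTERS[1:5], as.character(1:5))
)

# Each cell counts the qualities shared by the row person and the column person.
out <- tcrossprod(m)
out
```

The diagonal holds each person's own number of qualities, which is why it is not all ones at this stage.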
Then set the diagonal to one, if required:
diag(out) <- 1
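One caveat: the off-diagonal cells are counts of shared qualities, not 0/1 flags. In the small example no pair shares more than one quality, but with 40 variables many pairs probably will. If the goal is "share at least one quality", a possible sketch is to threshold the counts before summing. This is my reading of the question, not part of the original answer, and the three rows below are hypothetical data chosen only to show the difference:

```r
# Hypothetical rows: A and B share two qualities, C has none.
m <- rbind(A = c(1, 1, 0),
           B = c(1, 1, 1),
           C = c(0, 0, 0))

counts <- tcrossprod(m)      # counts["A", "B"] is 2, not 1
shared <- (counts > 0) * 1   # 1 if at least one quality is shared, else 0
diag(shared) <- 1            # treat each person as matching themselves
shared
```

Multiplying the logical matrix by 1 converts it back to numeric while keeping the row and column names.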
As DavidA points out in the comments, tcrossprod(m) is basically doing m %*% t(m).
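A quick way to convince yourself of that equivalence is to compare the two results on a small matrix. The data here is a made-up two-person example, used only for the check:

```r
# Small illustrative matrix: two people, three qualities.
m <- matrix(c(0, 1, 1,
              1, 0, 1),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("A", "B"), NULL))

# Both forms give the same person-by-person matrix.
same <- all.equal(tcrossprod(m), m %*% t(m))
same
```

The `tcrossprod` form is usually preferred because it avoids building the transposed copy explicitly.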
There are several ways to then calculate the sum; here is one:
sum(out[upper.tri(out, diag = TRUE)], na.rm = TRUE)
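For the follow-up about starting from a .csv file, here is a rough end-to-end sketch. It assumes the first column of the file holds the unique person IDs and every other column is a 0/1 variable; the column names (`id`, `q1` to `q5`) and the temporary file are stand-ins I made up so the example runs on its own. It also thresholds the counts at zero, on the assumption that a pair should count once no matter how many qualities they share:

```r
# Write a small stand-in CSV so the sketch runs; in practice use your own file path.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,q1,q2,q3,q4,q5",
             "A,0,1,1,0,0",
             "B,1,0,1,0,1",
             "C,0,1,0,1,0",
             "D,1,0,0,0,0",
             "E,0,0,0,0,1"), csv_path)

# Assumes the first column holds the unique person IDs.
dat <- read.csv(csv_path, row.names = 1)
m <- as.matrix(dat)              # people in rows, qualities in columns

out <- (tcrossprod(m) > 0) * 1   # 1 where two people share at least one quality
diag(out) <- 1                   # each person matches themselves
total <- sum(out[upper.tri(out, diag = TRUE)])
total                            # 9 for the example data
```

With about 2000 people the result is a 2000 x 2000 matrix, which should fit comfortably in memory on a typical machine.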