Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sum columns by group (row names) in a matrix

Let's say I have a matrix called x.

x <- structure(c(1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1), 
.Dim = c(5L, 4L), .Dimnames = list(c("Cake", "Pie", "Cake", "Pie", "Pie"),
c("Mon", "Tue", "Wed", "Thurs"))) 

x
     Mon   Tue   Wed   Thurs
Cake   1     0     1      1
Pie    0     0     1      1
Cake   1     1     0      1
Pie    0     0     1      1
Pie    0     0     1      1

I want to sum each column grouped by row names:

     Mon   Tue   Wed   Thurs
Cake   2     1     1      2
Pie    0     0     3      3

I've tried using addmargins(x), but that just gives me the sum of each column and row. Any suggestions? I searched other questions, but couldn't figure this out.

like image 802
pomegranate Avatar asked Apr 05 '15 21:04

pomegranate


People also ask

How to sum the sum of two columns in matrix?

If so, one way to achieve this is in your design tool, go to column (row) group, then add the total for the correspond group like below: The sum will automatically pop out to the right down corner of the matrix. If this is not what you need, please kindly clarify your request. Hope this could be helpful.

How do you find the sum of each row in MATLAB?

Given a matrix of order m×n, the task is to find out the sum of each row and each column of a matrix. The sum of each row and each column can be calculated by traversing through the matrix and adding up the elements. Below is the implementation of the above approach: echo "Sum of the row " . $i .

How to find the row sum for each column in R?

How to find the row sum for each column by row name in an R matrix? To find the row sum for each column by row name, we can use rowsum function. For example, if we have a matrix called M then the row sums for each column with row names can be calculated by using the command rowsum (M,row.names (M)).

How do I Group data by total units in Excel?

1 Select Group by on the Home tab. 2 Select the Advanced option, so you can select multiple columns to group by. 3 Select the Country and Sales Channel columns. 4 In the New columns section, create a new column where the name is Total units, the aggregate operation is Sum, and the column used is Units. 5 Hit OK


3 Answers

Here's a vectorized base solution

rowsum(df, row.names(x))
#      Mon Tue Wed Thurs
# Cake   2   1   1     2
# Pie    0   0   3     3

Or data.table version using keep.rownames = TRUE in order to convert your row names to a column

library(data.table)
as.data.table(x, keep.rownames = TRUE)[, lapply(.SD, sum), by = rn]
#      rn Mon Tue Wed Thurs
# 1: Cake   2   1   1     2
# 2:  Pie   0   0   3     3
like image 165
David Arenburg Avatar answered Nov 04 '22 14:11

David Arenburg


You can try this

df <- read.table(head=TRUE, text="
Name       Mon   Tue   Wed   Thurs
Cake   1     0     1      1
Pie    0     0     1      1
Cake   1     1     0      1
Pie    0     0     1      1
Pie    0     0     1      1")

aggregate(. ~ Name, data=df, FUN=sum)
##   Name Mon Tue Wed Thurs
## 1 Cake   2   1   1     2
## 2  Pie   0   0   3     3

also with dplyr

library(dplyr)
group_by(df, Name) %>%
    summarise(Mon = sum(Mon), Tue = sum(Tue), Wed = sum(Wed), Thurs = sum(Thurs))

or better

 group_by(df, Name) %>%
    summarise_each(funs(sum))
like image 37
Mamoun Benghezal Avatar answered Nov 04 '22 13:11

Mamoun Benghezal


An approach using plyr:

ldply(split(df, df$Name), function(u) colSums(u[-1]))
#   .id Mon Tue Wed Thurs
#1 Cake   2   1   1     2
#2  Pie   0   0   3     3

Data:

df = structure(list(Name = structure(c(1L, 2L, 1L, 2L, 2L), .Label = c("Cake", 
"Pie"), class = "factor"), Mon = c(1L, 0L, 1L, 0L, 0L), Tue = c(0L, 
0L, 1L, 0L, 0L), Wed = c(1L, 1L, 0L, 1L, 1L), Thurs = c(1L, 1L, 
1L, 1L, 1L)), .Names = c("Name", "Mon", "Tue", "Wed", "Thurs"
), row.names = c(NA, -5L), class = "data.frame")
like image 34
Colonel Beauvel Avatar answered Nov 04 '22 15:11

Colonel Beauvel