Efficiently sum across multiple columns in R

Tags: r, sum

I have the following condensed data set:

a<-as.data.frame(c(2000:2005))
a$Col1<-c(1:6)
a$Col2<-seq(2,12,2)

colnames(a)<-c("year","Col1","Col2")

for (i in 1:2){
  a[[paste("Var_", i, sep="")]]<-i*a[[paste("Col", i, sep="")]]
}

I would like to sum the columns Var_1 and Var_2, for which I currently use:

a$sum<-a$Var_1 + a$Var_2 

In reality my data set is much larger - I would like to sum from Var_1 to Var_n (n can be up to 20). There must be a more efficient way to do this than:

 a$sum<-a$Var_1 + ... + a$Var_n 
user2568648 asked Mar 12 '15 09:03




2 Answers

Here's a solution using the tidyverse. You can extend it to as many columns as you like using the select() function to select the appropriate columns within a mutate().

library(tidyverse)

a<-as.data.frame(c(2000:2005))
a$Col1<-c(1:6)
a$Col2<-seq(2,12,2)

colnames(a)<-c("year","Col1","Col2")

for (i in 1:2){
    a[[paste("Var_", i, sep="")]]<-i*a[[paste("Col", i, sep="")]]
}
a
#>   year Col1 Col2 Var_1 Var_2
#> 1 2000    1    2     1     4
#> 2 2001    2    4     2     8
#> 3 2002    3    6     3    12
#> 4 2003    4    8     4    16
#> 5 2004    5   10     5    20
#> 6 2005    6   12     6    24

# Tidyverse solution
a %>%
    mutate(Total = select(., Var_1:Var_2) %>% rowSums(na.rm = TRUE))
#>   year Col1 Col2 Var_1 Var_2 Total
#> 1 2000    1    2     1     4     5
#> 2 2001    2    4     2     8    10
#> 3 2002    3    6     3    12    15
#> 4 2003    4    8     4    16    20
#> 5 2004    5   10     5    20    25
#> 6 2005    6   12     6    24    30

Created on 2019-01-01 by the reprex package (v0.2.1)
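If the columns to sum share a prefix, a variation on the same idea is to select them by name rather than by position. The sketch below is not part of the original answer: it assumes dplyr 1.0.0 or later (for across()) and that every column to be summed starts with "Var_".

```r
library(dplyr)

# Rebuild the example data from the question.
a <- data.frame(year = 2000:2005, Col1 = 1:6, Col2 = seq(2, 12, 2))
for (i in 1:2) {
  a[[paste0("Var_", i)]] <- i * a[[paste0("Col", i)]]
}

# Assumption: all columns to sum are named Var_1, Var_2, ..., Var_n.
# across() hands the matching columns to rowSums() as a data frame.
result <- a %>%
  mutate(Total = rowSums(across(starts_with("Var_")), na.rm = TRUE))

result$Total
```

Selecting by prefix means the call does not need to change when n grows from 2 to 20, as long as the naming scheme holds.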

Matt Dancho answered Sep 19 '22 16:09



You can use colSums(a[,c("Var_1", "Var_2")]) or rowSums(a[,c("Var_1", "Var_2")]). colSums() returns one total per column, while rowSums() returns one total per row. In your case you want the latter.
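To scale the rowSums() approach from two columns to Var_1 through Var_n, the column names can be generated instead of typed out. This is a minimal sketch in base R, assuming the columns follow the Var_1 ... Var_n naming from the question; the helper names n, var_cols and var_cols_by_pattern are illustrative only.

```r
# Rebuild the example data from the question.
a <- data.frame(year = 2000:2005, Col1 = 1:6, Col2 = seq(2, 12, 2))
for (i in 1:2) {
  a[[paste0("Var_", i)]] <- i * a[[paste0("Col", i)]]
}

# Option 1: build the names explicitly; n is the number of Var_ columns.
n <- 2
var_cols <- paste0("Var_", seq_len(n))

# Option 2: pick up every column whose name starts with "Var_".
var_cols_by_pattern <- grep("^Var_", names(a), value = TRUE)

# Row-wise sum over the selected columns; drop = FALSE keeps a data frame
# even when only one column is selected, and na.rm = TRUE skips NA values.
a$sum <- rowSums(a[, var_cols, drop = FALSE], na.rm = TRUE)

a$sum
```

Either way of building the name vector selects the same columns here; the pattern version avoids having to know n in advance.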

psoares answered Sep 22 '22 16:09
