
How to do rowwise summation over selected columns using column index with dplyr?

Tags: r, dplyr

In dplyr, how do you perform rowwise summation over selected columns (using column index)?

This doesn't work; every row gets the same value:

> iris  %>% mutate(sum=sum(.[1:4])) %>% head
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species    sum
1          5.1         3.5          1.4         0.2  setosa 2078.7
2          4.9         3.0          1.4         0.2  setosa 2078.7
3          4.7         3.2          1.3         0.2  setosa 2078.7
4          4.6         3.1          1.5         0.2  setosa 2078.7
5          5.0         3.6          1.4         0.2  setosa 2078.7
6          5.4         3.9          1.7         0.4  setosa 2078.7
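The repeated 2078.7 can be checked directly. As a minimal sketch using only base R: `sum()` collapses its whole argument into a single scalar, which `mutate()` then recycles down the new column, whereas `rowSums()` returns one value per row.

```r
# sum() over a data frame adds every cell and returns one scalar
total <- sum(iris[1:4])
total  # 2078.7, the value repeated in every row above

# The per-column totals add up to that same number
col_totals <- colSums(iris[1:4])
sum(col_totals)

# rowSums() instead returns one sum per row (150 values for iris)
row_totals <- rowSums(iris[1:4])
head(row_totals)
```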

I can do the following, but it's not pretty:

> iris %>% mutate(index=1:n()) %>%  
                gather("param", "value", 1:4)  %>% 
                group_by(index) %>% 
                mutate(sum=sum(value)) %>% 
                spread(param, value) %>% select(-index)
Source: local data frame [150 x 6]

   Species  sum Sepal.Length Sepal.Width Petal.Length Petal.Width
1   setosa 10.2          5.1         3.5          1.4         0.2
2   setosa  9.5          4.9         3.0          1.4         0.2
3   setosa  9.4          4.7         3.2          1.3         0.2
4   setosa  9.4          4.6         3.1          1.5         0.2
5   setosa 10.2          5.0         3.6          1.4         0.2
6   setosa 11.4          5.4         3.9          1.7         0.4
7   setosa  9.7          4.6         3.4          1.4         0.3
8   setosa 10.1          5.0         3.4          1.5         0.2
9   setosa  8.9          4.4         2.9          1.4         0.2
10  setosa  9.6          4.9         3.1          1.5         0.1
..     ...  ...          ...         ...          ...         ...

Is there a syntactically nicer way to achieve this?

EDIT: This is different from other questions because I want to do a rowwise operation on columns selected by column index.

Alby asked Jul 02 '15 19:07



2 Answers

As already said in the comment, you can accomplish your task with:

iris %>% mutate(sum=Reduce("+",.[1:4]))

In this case, base R's rowSums also works:

iris$sum<-rowSums(iris[,1:4])
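A quick check that the two approaches agree, plus one difference worth knowing. This is a sketch that assumes dplyr is installed; the toy data frame `df` is hypothetical and only there to show how missing values are handled.

```r
library(dplyr)

# Reduce("+", ...) adds the selected columns element by element
by_reduce <- iris %>% mutate(sum = Reduce("+", .[1:4]))

# rowSums() can also be used inside mutate() with the same index selection
by_rowsums <- iris %>% mutate(sum = rowSums(.[1:4]))

# The results match on complete data (unname() drops the row names
# that rowSums() attaches to its result)
same <- all.equal(by_reduce$sum, unname(by_rowsums$sum))
same

# With missing values the two differ:
# rowSums() can skip NAs via na.rm, Reduce("+") propagates them
df <- data.frame(a = c(1, NA), b = c(2, 3))  # hypothetical toy data
skip_na <- rowSums(df, na.rm = TRUE)
keep_na <- Reduce("+", df)
```

If the selected columns may contain NAs, `rowSums(..., na.rm = TRUE)` is probably the simpler choice.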
nicola answered Oct 01 '22 23:10



You can (ab)use base R's subset, which allows selection of columns by number:

iris %>% subset(select=1:4) %>% mutate(sum=rowSums(.))
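One thing to watch: `subset(select = 1:4)` keeps only the selected columns, so `Species` is dropped from the result. The sketch below shows two ways to keep every column while still selecting by index. It assumes dplyr is installed, and the `rowwise()`/`c_across()` variant assumes dplyr 1.0.0 or later, which postdates the question.

```r
library(dplyr)

# subset(select = 1:4) keeps only the four selected columns plus the new sum
dropped <- iris %>% subset(select = 1:4) %>% mutate(sum = rowSums(.))
names(dropped)

# Selecting inside mutate() leaves the original columns untouched
kept <- iris %>% mutate(sum = rowSums(select(., 1:4)))
names(kept)

# Assumes dplyr >= 1.0.0: rowwise() + c_across() works with any
# row-level function, though it is slower than rowSums()
kept_rw <- iris %>%
  rowwise() %>%
  mutate(sum = sum(c_across(1:4))) %>%
  ungroup()
```

`rowSums()` is vectorised, so it is likely the better default; the `rowwise()` form is mainly useful when the per-row function has no vectorised equivalent.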
Hong Ooi answered Oct 02 '22 01:10
