
rowwise() sum with vector of column names in dplyr

Tags: r, dplyr, rlang

I am once again confused about how to achieve this:

Given this data frame:

df <- tibble(
  foo = c(1,0,1),
  bar = c(1,1,1),
  foobar = c(0,1,1)
)

And this vector:

to_sum <- c("foo", "bar")

I would like to get the row-wise sum of the values in the columns to_sum.

Desired output:

# A tibble: 3 x 4
# Rowwise: 
    foo   bar foobar   sum
  <dbl> <dbl>  <dbl> <dbl>
1     1     1      0     2
2     0     1      1     1
3     1     1      1     2

Typing it out works (obviously).

df %>% rowwise() %>% 
  mutate(
    sum = sum(foo, bar)
  )

This does not:

df %>% rowwise() %>% 
  mutate(
    sum = sum(to_sum)
  )

I understand why that fails: it is effectively the same as the following, which tries to sum character strings rather than column values:

df %>% rowwise() %>% 
  mutate(
    sum = sum("foo", "bar")
  )

How can I compute the row-wise sum from a vector of column names?

asked Jul 19 '21 15:07 by MKR



4 Answers

I think you are looking for rlang::syms, which converts the strings to symbols; !!! then splices those symbols into the call as separate arguments:

library(dplyr)
library(rlang)
df %>% 
  rowwise() %>% 
  mutate(
    sum = sum(!!!syms(to_sum))
  )
#     foo   bar foobar   sum
#   <dbl> <dbl>  <dbl> <dbl>
# 1     1     1      0     2
# 2     0     1      1     1
# 3     1     1      1     2
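
To see what the splice does, here is a minimal sketch that builds the call without evaluating it. It assumes only that rlang is installed; the variable name built_call is made up for illustration.

```r
library(rlang)

to_sum <- c("foo", "bar")

# syms() turns each string into a symbol (a bare column name).
column_symbols <- syms(to_sum)

# expr() captures the call; !!! splices the symbols in as separate arguments,
# so the result is the same call you would have typed by hand.
built_call <- expr(sum(!!!column_symbols))
print(built_call)
```

Inside mutate() the same splice happens before the expression is evaluated row by row, which is why it behaves like the typed-out version.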
answered Oct 29 '22 05:10 by user63230


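Another option is to parse the column names into expressions, evaluate each one against df, and take the row sums of the result: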

library(dplyr)
library(purrr)
library(rlang)

df %>%
  bind_cols(parse_exprs(to_sum) %>%
              map_dfc(~ eval_tidy(.x, data = df)) %>%
              rowSums()) %>%
  rename(sum = ...4)

# A tibble: 3 x 4
    foo   bar foobar   sum
  <dbl> <dbl>  <dbl> <dbl>
1     1     1      0     2
2     0     1      1     1
3     1     1      1     2
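
If no tidy evaluation is needed, the same idea (pick the named columns, then row-sum them) can be sketched in base R. This is a swapped-in alternative, not the approach above; it uses a plain data.frame so that it runs without any packages.

```r
# Same data as in the question, as a base data.frame.
df <- data.frame(
  foo = c(1, 0, 1),
  bar = c(1, 1, 1),
  foobar = c(0, 1, 1)
)
to_sum <- c("foo", "bar")

# Indexing with a character vector selects the named columns;
# rowSums() then adds them up row by row (the sums are 2, 1, 2).
df$sum <- rowSums(df[to_sum])
print(df)
```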
answered Oct 29 '22 05:10 by Anoushiravan R


library(janitor)
df %>%
  adorn_totals("col",,,"sum",to_sum)

 foo bar foobar sum
   1   1      0   2
   0   1      1   1
   1   1      1   2

Why ,,, ?

If you look at ?adorn_totals, you'll see its arguments:

adorn_totals(dat, where = "row", fill = "-", na.rm = TRUE, name = "Total", ...)

The final one, ..., controls column selection. R matches unnamed arguments by position, so to_sum can only reach ... once every argument before it has a slot. Here where is given as "col" and name as "sum"; the two empty slots between the commas tell R to keep the default values for fill and na.rm. At that point every argument before ... is accounted for, so to_sum is passed to ....

The topic is discussed further here: Specify the dots argument when calling a tidyselect-using function without needing to specify the preceding arguments
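
The matching rules can be checked without janitor. The sketch below uses a made-up function, show_args(), that merely copies the formals listed above and reports what each argument received; it is an illustration, not part of any package.

```r
# Hypothetical stand-in with the same formals as adorn_totals();
# it only reports which value each argument received.
show_args <- function(dat, where = "row", fill = "-", na.rm = TRUE,
                      name = "Total", ...) {
  list(where = where, fill = fill, na.rm = na.rm, name = name,
       dots = list(...))
}

# Empty slots keep the defaults for fill and na.rm, so "foo" reaches `...`.
with_commas <- show_args("data", "col", , , "sum", "foo")

# Naming only some arguments is not enough: the unnamed "foo" is matched
# positionally to the next unfilled formal, which is `fill`.
partly_named <- show_args("data", where = "col", name = "sum", "foo")

# Naming every argument that precedes `...` also routes "foo" to `...`.
fully_named <- show_args("data", where = "col", fill = "-", na.rm = TRUE,
                         name = "sum", "foo")

str(with_commas)
str(partly_named)
```

Spelling out every preceding argument by name is therefore an alternative to the empty commas, at the cost of repeating the default values.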

answered Oct 29 '22 06:10 by Sam Firke


You can use c_across together with any_of. This is the approach the dplyr authors intend for row-wise operations: check out vignette("rowwise", package = "dplyr").

library(dplyr)

df %>% 
  rowwise() %>% 
  mutate(sum = sum(c_across(any_of(to_sum))))

#> # A tibble: 3 x 4
#> # Rowwise: 
#>     foo   bar foobar   sum
#>   <dbl> <dbl>  <dbl> <dbl>
#> 1     1     1      0     2
#> 2     0     1      1     1
#> 3     1     1      1     2

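c_across is designed for rowwise operations. Wrapping to_sum in any_of tells tidyselect explicitly that it is a character vector of column names rather than a column called to_sum. The selection works without the wrapper, but that form is ambiguous, so the explicit helper is preferred. Note that any_of silently skips names that are not in the data; use all_of instead if a missing column should raise an error.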

You may want to call ungroup() at the end to remove the rowwise grouping, so that later verbs operate on the whole table again.
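
A minimal sketch of that last step, assuming dplyr 1.0.0 or later. It also shows a vectorised variant using rowSums() with across(), which is a different technique from c_across() and is included only for comparison; the names rowwise_result and vectorised_result are made up.

```r
library(dplyr)

df <- tibble(
  foo = c(1, 0, 1),
  bar = c(1, 1, 1),
  foobar = c(0, 1, 1)
)
to_sum <- c("foo", "bar")

# Row-wise sum, then drop the rowwise grouping with ungroup().
# all_of() is used so that a misspelled column name raises an error.
rowwise_result <- df %>%
  rowwise() %>%
  mutate(sum = sum(c_across(all_of(to_sum)))) %>%
  ungroup()

# Vectorised variant: no rowwise() needed, so there is nothing to ungroup.
vectorised_result <- df %>%
  mutate(sum = rowSums(across(all_of(to_sum))))

print(rowwise_result)
print(vectorised_result)
```

On large tables the vectorised form usually runs faster because the sum is computed once for all rows rather than once per row.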

answered Oct 29 '22 07:10 by Edo