Assume the following data:
dat <- data.frame(x1 = c(1, 2, 3, 4, 5),
x2 = c(2, 3, 4, 5, 6),
x3 = c(3, 4, 5, 6, 7),
x4 = c(7, 2, 3, 4, 5),
x5 = c(7, 2, 1, 4, 5))
Further assume the following lookup table:
lookup_positions <- data.frame(v1 = c(1,3,5),
v2 = c(1,2,5),
v3 = c(1,3,4),
v4 = c(2,3,5))
Now, what I want to do is the following: for each row in dat
I want to go through all combinations specified in lookup_positions
and calculate the row sums for the dat
column positions specified in lookup_positions
.
So for all rows in dat
I want to calculate the row sum of dat[,c(1,3,5)]
, then I want to calculuate the row sum of dat[, c(1,2,5)]
and so on. So I basically calculate 4 row sums.
I know how to do it in base R using a loop, I also now how to do it in tidyverse approach for one row sum, but not how to do it for all versions mentioned in lookup_positions
with tidyverse without loops.
So expected outcome would be:
x1 x2 x3 x4 x5 rowsum1 rowsum2 rowsum3 rowsum4
1 1 2 3 7 7 11 10 11 12
2 2 3 4 2 2 8 7 8 9
3 3 4 5 3 1 9 8 11 10
4 4 5 6 4 4 14 13 14 15
5 5 6 7 5 5 17 16 17 18
Here's what I got for one of the lookup_positions in the tidyverse. But I'm stuck in how to generalize this for all lookup positions.
dat %>%
mutate(rowsum1 = apply(across(everything()), 1, function(x) sum(x[as.numeric(lookup_positions[1,])])))
I know for my 4 lookup positions I could simply do a copy paste and be done with it, but my real life data has a few hundred lookup position combinations.
Syntax: mutate(new-col-name = rowSums(.)) The rowSums() method is used to calculate the sum of each row and then append the value at the end of each row under the new column name specified. The argument . is used to apply the function over all the cells of the data frame. Syntax: rowSums(.)
rowwise() allows you to compute on a data frame a row-at-a-time. This is most useful when a vectorised function doesn't exist. Most dplyr verbs preserve row-wise grouping.
One dplyr
and purrr
option could be:
map2(.x = asplit(lookup_positions, 2),
.y = 1:ncol(lookup_positions),
~ dat %>%
mutate(!!paste0("rowsums", .y) := rowSums(select(., .x)))) %>%
reduce(full_join)
x1 x2 x3 x4 x5 rowsums1 rowsums2 rowsums3 rowsums4
1 1 2 3 7 7 11 10 11 12
2 2 3 4 2 2 8 7 8 9
3 3 4 5 3 1 9 8 11 10
4 4 5 6 4 4 14 13 14 15
5 5 6 7 5 5 17 16 17 18
Here is another tidyverse
solution you may be interested in
library(dplyr)
library(purrr)
library(stringr)
dat %>%
mutate(map_dfc(
lookup_positions %>% rename_with(~str_replace(., "v", "rowsum")),
~rowSums(.y[, .x]),
across(everything())
))
Output
x1 x2 x3 x4 x5 rowsum1 rowsum2 rowsum3 rowsum4
1 1 2 3 7 7 11 10 11 12
2 2 3 4 2 2 8 7 8 9
3 3 4 5 3 1 9 8 11 10
4 4 5 6 4 4 14 13 14 15
5 5 6 7 5 5 17 16 17 18
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With