I have a follow-up on this question: Sum values from rows with conditions in R
Here is my data:
ID <- c("A", "B", "C", "D", "E", "F")
Q1 <- c(0, 1, 7, 9, NA, 3)
Q2 <- c(0, 3, 2, 2, NA, 3)
Q3 <- c(0, 0, 7, 9, NA, 3)
dta <- data.frame(ID, Q1, Q2, Q3)
For each row, I need to sum the values below 7 and ignore any value of 7 or higher. Rows where every value is NA should stay NA. The result should look like this:
ProxySum
0
4
2
2
NA
9
I have tried this code based on the response from the last post:
dta2 <- dta %>%
rowwise() %>%
mutate(ProxySum = ifelse(all(c_across(Q1:Q3) < 7), Reduce(`+`, c_across(Q1:Q3)), (ifelse(any(c_across(Q1:Q3) > 7), sum(.[. < 7]), NA))))
But in the rows with numbers over 7 I end up with a sum over all the rows and columns. What am I missing?
One way to do it in base:
rowSums(dta[, 2:4] * (dta[, 2:4] < 7))
# [1] 0 4 2 2 NA 9
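To make the one-liner easier to follow, it can be unpacked into intermediate steps. This is only a sketch of the same idea; the names vals, keep, masked and ProxySum are illustrative and not part of the original answer.

```r
# Same data as in the question
dta <- data.frame(
  ID = c("A", "B", "C", "D", "E", "F"),
  Q1 = c(0, 1, 7, 9, NA, 3),
  Q2 = c(0, 3, 2, 2, NA, 3),
  Q3 = c(0, 0, 7, 9, NA, 3)
)

vals <- dta[, 2:4]           # numeric columns only (Q1, Q2, Q3)
keep <- vals < 7             # TRUE below 7, FALSE for 7 or more, NA where the value is NA
masked <- vals * keep        # TRUE acts as 1, FALSE acts as 0, NA stays NA
ProxySum <- rowSums(masked)  # values: 0 4 2 2 NA 9
```

Inspecting keep and masked row by row shows where each value is kept, zeroed out, or left as NA.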
Adding an explanation, following @tjebo's comment:
With dta[, 2:4] < 7 you produce a logical matrix in which TRUE marks the values that are less than 7 and FALSE marks the values that are 7 or more. This can be done in one line because the comparison is vectorized.
Multiplying the original values by that result coerces the logical values to numeric, so every FALSE becomes 0 and every TRUE becomes 1. In effect, you multiply each original value by 1 if it is less than 7 and by 0 otherwise.
NA < 7 produces NA, and multiplying by NA produces NA as well, so the original NAs are preserved.
Finally, rowSums() is called on the resulting data frame to sum the values in each row. Since the values of 7 or more have been turned into 0, they are excluded from the sum.
If you want to ignore NAs, you can add the na.rm = TRUE argument to the rowSums() call. However, in that case rows that contain only NAs will give 0.
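If NAs should be skipped within a row while rows made up only of NAs still return NA, one possible approach is to sum with na.rm = TRUE and then put NA back for the all-NA rows. This is a sketch only: row G is a hypothetical extra row added for illustration, and the names vals, partial and all_na are not from the original answer.

```r
# Data from the question plus a hypothetical row G with a single NA
dta <- data.frame(
  ID = c("A", "B", "C", "D", "E", "F", "G"),
  Q1 = c(0, 1, 7, 9, NA, 3, NA),
  Q2 = c(0, 3, 2, 2, NA, 3, 2),
  Q3 = c(0, 0, 7, 9, NA, 3, 8)
)

vals <- dta[, 2:4]

# Sum the values below 7, skipping NAs; all-NA rows come out as 0 at this point
partial <- rowSums(vals * (vals < 7), na.rm = TRUE)

# Flag rows where every value is missing and restore NA for them
all_na <- rowSums(!is.na(vals)) == 0
partial[all_na] <- NA
# values: 0 4 2 2 NA 9 2
```

Row G then contributes only its 2, since the NA is skipped and the 8 is zeroed out, while row E stays NA.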