I want to sum up all but one numerical column in this dataframe.
Group, Registered, Votes, Beans
A, 111, 12, 100
A, 111, 13, 200
A, 111, 14, 300
I want to group this by Group, summing up all the columns except Registered. This is what I tried:
summarise_if(
  .tbl = group_by(
    .data = x,
    Group
  ),
  .predicate = is.numeric,
  .funs = sum
)
The problem is that the result is a data frame that sums ALL the numeric columns, including Registered. How do I sum all but Registered?
The output I want would look like this
Group, Registered, Votes, Beans
A, 111, 39, 600
I would use summarise_at, and just make a logical vector which is FALSE for non-numeric columns and for Registered, and TRUE otherwise, i.e.
df %>%
summarise_at(which(sapply(df, is.numeric) & names(df) != 'Registered'), sum)
If you wanted to just summarise all but one column you could do
df %>%
summarise_at(vars(-Registered), sum)
but this only works when every remaining column is numeric, so in this case you have to check for numeric columns as well.
Notes:
Factors are stored as integer codes internally, but is.numeric() already returns FALSE for a factor, so sapply(df, is.numeric) excludes factor columns without an extra !is.factor(x) check.
If your data is big, I think it is faster to use sapply(df[1,], is.numeric) instead of sapply(df, is.numeric). (Someone please correct me if I'm wrong.)
Edit:
Modified versions of the two methods above for dplyr version >= 1.0.0, since summarise_at is superseded:
df %>%
summarise(across(where(is.numeric) & !Registered, sum))
df %>%
summarise(across(-Registered, sum))
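The snippets above sum over the whole data frame and drop Registered from the result, whereas the question asks for a per-Group summary that keeps Registered. A minimal sketch of one way to get that, assuming dplyr >= 1.0.0 and assuming Registered is constant within each Group (as in the sample data): adding Registered to the grouping keeps it in the output and excludes it from across(), since grouping columns are never summarised. The `df` below is just the question's sample data rebuilt for illustration.

```r
library(dplyr)

# Sample data from the question, rebuilt here so the example is self-contained
df <- data.frame(
  Group = c("A", "A", "A"),
  Registered = c(111, 111, 111),
  Votes = c(12, 13, 14),
  Beans = c(100, 200, 300)
)

# Assumption: Registered takes a single value per Group.
# Grouping by it keeps the column and shields it from the sum.
result <- df %>%
  group_by(Group, Registered) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop")

print(result)
```

If Registered can vary within a group, this would produce one row per Group/Registered combination instead, so it is worth checking that assumption on real data first.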
We can drop Registered with select and then use summarise_if:
library(dplyr)
df %>%
select(-Registered) %>%
summarise_if(is.numeric, sum)
# Votes Beans
#1 39 600