If I have a dataframe:
d = data.frame(sample=c("a2","a3"),a=c(1,5),b=c(4,5),c=c(6,4))
d
  sample a b c
1     a2 1 4 6
2     a3 5 5 4
How do I divide the sum of each column by the sum of the entire dataframe using dplyr so I end up with a dataframe that looks like:
     a    b     c
1 6/25 9/25 10/25
I tried to do
d <- d %>%
mutate_if(is.numeric, funs(colSums(d)/sum(d)))
but it keeps returning an error.
Thanks in advance!
Except for 2a and 2b, in each of these alternatives we could replace the first two components of the pipeline with d[-1] if it is safe to assume that only the first column is non-numeric.
1) Base R With base R we get a straightforward solution:
d |> Filter(f = is.numeric) |> colSums() |> prop.table()
## a b c
## 0.24 0.36 0.40
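To make the arithmetic behind (1) explicit, here is a minimal sketch that computes the same proportions by hand and compares them with prop.table. The intermediate names (col_totals, by_hand, via_prop_table) are only illustrative.

```r
# Same data as in the question
d <- data.frame(sample = c("a2", "a3"), a = c(1, 5), b = c(4, 5), c = c(6, 4))

# Column sums of the numeric columns only: a = 6, b = 9, c = 10
col_totals <- colSums(Filter(is.numeric, d))

# With no margin argument, prop.table(x) is x / sum(x)
by_hand <- col_totals / sum(col_totals)
via_prop_table <- prop.table(col_totals)

# Both routes should agree
all.equal(by_hand, via_prop_table)
```

The grand total here is 25, so the three proportions are 6/25, 9/25 and 10/25, which is the result asked for in the question.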
2) dplyr With dplyr:
library(dplyr)
d %>%
select(where(is.numeric)) %>%
summarize(across(everything(), sum) / sum(.))
## a b c
## 1 0.24 0.36 0.4
2a) or
d %>%
summarize(across(where(is.numeric), sum)) %>%
{ . / sum(.) }
2b) The scoped functions such as the *_if functions have been superseded by across, but they are still available. If you want to use them anyway, try this, which is close to the code in the question:
d %>%
summarize_if(is.numeric, sum) %>%
{ . / sum(.) }
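As for why the attempt in the question fails: the question does not quote the error message, so this is an assumption, but the likely cause is that colSums(d) and sum(d) are applied to the whole data frame, including the character column sample. A minimal base R sketch of that failure and of the fix (the names attempt and num are illustrative):

```r
d <- data.frame(sample = c("a2", "a3"), a = c(1, 5), b = c(4, 5), c = c(6, 4))

# colSums() on the full data frame hits the character column and stops;
# tryCatch() captures the error message instead of aborting
attempt <- tryCatch(colSums(d), error = function(e) conditionMessage(e))
attempt

# Dropping the non-numeric column first avoids the error
num <- d[sapply(d, is.numeric)]
colSums(num) / sum(num)
```

Note also that mutate keeps one row per input row, whereas the desired result has a single row, which is why the answers here use summarize instead.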
3) collapse With the collapse package, get the numeric variables (nv), sum each column (fsum) and then take proportions. When I benchmarked it on this data it ran 3x faster than (1), over 100x faster than (2) and 300x faster than (4).
library(collapse)
d |> nv() |> fsum() |> fsum(TRA = "/")
## a b c
## 0.24 0.36 0.40
4) dplyr/tidyr With tidyr and dplyr we can convert to long form, process and convert back.
library(dplyr)
library(tidyr)
d %>%
select(where(is.numeric)) %>%
pivot_longer(everything()) %>%
group_by(name) %>%
summarize(value = sum(value) / sum(.$value), .groups = "drop") %>%
pivot_wider()
## # A tibble: 1 x 3
## a b c
## <dbl> <dbl> <dbl>
## 1 0.24 0.36 0.4
We could use colSums and the sum of colSums. The -1 index excludes column 1 from the calculation:
result <- colSums(d[,-1])/sum(colSums(d[,-1]))
result
Output:
a b c
0.24 0.36 0.40
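If a one-row data frame is wanted rather than a named vector, as in the question's desired output, the same idea can be wrapped in a small function. This is only a sketch: col_props is a hypothetical helper name, and it selects numeric columns by type instead of assuming the first column is the only non-numeric one.

```r
# col_props() is a hypothetical helper, not part of any package
col_props <- function(df) {
  num <- df[vapply(df, is.numeric, logical(1))]  # keep numeric columns only
  totals <- colSums(num)                         # per-column sums
  as.data.frame(as.list(totals / sum(totals)))   # one row, one column per variable
}

d <- data.frame(sample = c("a2", "a3"), a = c(1, 5), b = c(4, 5), c = c(6, 4))
col_props(d)
```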