I am wondering if there is an easy way to specify the number of digits reported by summarise in dplyr, ideally using a native dplyr or other tidyverse function?
Here's some toy data
library(dplyr)
df <- data.frame(group = rep(letters[1:2], each = 10, length.out = 40),
large = rnorm(40, 100, 15),
small = rnorm(40, 0.5, 0.02))
If we then summarise via
df %>% group_by(group) %>% summarise(mL = mean(large), mS = mean(small))
We get
# group mL mS
# <fct> <dbl> <dbl>
# 1 a 104. 0.496
# 2 b 97.6 0.506
Note that without specifying any rounding the variable with the higher mean has been rounded to 1 decimal place and the variable with the smaller mean has been rounded to 3.
Now what if we want the variable with the larger mean to also be reported to 3 decimal places? If we include a command to round like so
df %>% group_by(group) %>% summarise(mL = round(mean(large),3), mS = mean(small))
There is no change in the output
# group mL mS
# <fct> <dbl> <dbl>
# 1 a 104. 0.496
# 2 b 97.6 0.506
Only if we use the format() function can we obtain what we are after
df %>% group_by(group) %>% summarise(mL = format(round(mean(large),3), nsmall = 3), mS = mean(small))
# group mL mS
# <fct> <chr> <dbl>
# 1 a 103.888 0.496
# 2 b 97.626 0.506
Is there an easier way to do this? Ideally using some kind of tidyverse function.
This is to do with the way tibbles are printed. The actual numbers in the data frame still have all their decimal places; they are just not displayed when the tibble is printed.
You can use as.data.frame() or print.data.frame(), which will show you more digits (depending on your getOption("digits")). You can also change the tibble settings, but my understanding is that these are always based on significant figures rather than decimal places (so your values >100 will be shown with fewer decimal places than values <100). See
https://tibble.tidyverse.org/reference/formatting.html for tibble printing options.
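As a rough illustration of the point about tibble printing, the sketch below rebuilds the toy data (the seed is an arbitrary assumption, added only for reproducibility), shows that the stored values keep full precision, and temporarily raises the pillar.sigfig option, which controls how many significant figures tibbles print.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, an assumption not in the original post
df <- data.frame(group = rep(letters[1:2], each = 10, length.out = 40),
                 large = rnorm(40, 100, 15),
                 small = rnorm(40, 0.5, 0.02))

res <- df %>%
  group_by(group) %>%
  summarise(mL = mean(large), mS = mean(small))

# The underlying doubles are untouched; only the tibble printout is shortened.
print(res$mL, digits = 10)

# Ask for more significant figures when tibbles are printed (default is 3).
old <- options(pillar.sigfig = 6)
print(res)
options(old)  # restore the previous printing options
```

Because pillar.sigfig counts significant figures, columns of different magnitude will still end up with different numbers of decimal places.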
So
df %>% group_by(group) %>% summarise(mL = round(mean(large),3), mS = round(mean(small),3)) %>%
as.data.frame()
will give you values to 3 decimal places, and
df %>% group_by(group) %>% summarise(mL = mean(large), mS = mean(small)) %>%
as.data.frame()
will show up to getOption("digits") significant digits (7 by default).
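If the goal is a fixed number of decimal places while staying with tibbles, one possible option is tibble::num(), which I believe is available from tibble 3.1.0 onwards. The sketch below is a minimal example under that assumption: it keeps the columns numeric and only attaches a display rule. The seed and the name res_num are illustrative, not from the original post.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, an assumption not in the original post
df <- data.frame(group = rep(letters[1:2], each = 10, length.out = 40),
                 large = rnorm(40, 100, 15),
                 small = rnorm(40, 0.5, 0.02))

res_num <- df %>%
  group_by(group) %>%
  summarise(mL = mean(large), mS = mean(small)) %>%
  # Tag both summary columns so they print with exactly 3 digits
  # after the decimal point; the stored values are not rounded.
  mutate(across(c(mL, mS), ~ tibble::num(.x, digits = 3)))

print(res_num)
```

Unlike format(), this does not turn the column into character, so it can still be used in later arithmetic, although the display attribute may not survive every operation.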
Also note that if you want to do the same thing to multiple columns in summarise, summarise_at() can be very helpful, e.g.
df %>% group_by(group) %>% summarise_at(c("large","small"), ~round(mean(.),3)) %>%
print.data.frame()
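In dplyr 1.0.0 and later, the scoped verbs such as summarise_at() are marked as superseded in favour of across(). A sketch of the same multi-column summary written that way, again with an arbitrary seed added for reproducibility:

```r
library(dplyr)

set.seed(1)  # arbitrary seed, an assumption not in the original post
df <- data.frame(group = rep(letters[1:2], each = 10, length.out = 40),
                 large = rnorm(40, 100, 15),
                 small = rnorm(40, 0.5, 0.02))

res <- df %>%
  group_by(group) %>%
  # Apply the same rounded mean to every selected column.
  summarise(across(c(large, small), ~ round(mean(.x), 3))) %>%
  as.data.frame()  # print as a plain data frame rather than a tibble

print(res)
```

As with summarise_at(), the output columns keep the input names (large and small) unless a .names pattern is supplied to across().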