How to use dplyr::mutate_all for rounding selected columns

Tags: r, dplyr

I'm using the following package version:

# devtools::install_github("hadley/dplyr")
> packageVersion("dplyr")
[1] ‘0.5.0.9001’

With the following tibble:

library(dplyr)
df  <- structure(list(gene_symbol = structure(1:6, .Label = c("0610005C13Rik", 
"0610007P14Rik", "0610009B22Rik", "0610009L18Rik", "0610009O20Rik", 
"0610010B08Rik"), class = "factor"), fold_change = c(1.54037, 
1.10976, 0.785, 0.79852, 0.91615, 0.87931), pvalue = c(0.5312, 
0.00033, 0, 0.00011, 0.00387, 0.01455), ctr.mean_exp = c(0.00583, 
59.67286, 83.2847, 6.88321, 14.67696, 1.10363), tre.mean_exp = c(0.00899, 
66.22232, 65.37819, 5.49638, 13.4463, 0.97043), ctr.cv = c(5.49291, 
0.20263, 0.17445, 0.46288, 0.2543, 0.39564), tre.cv = c(6.06505, 
0.28827, 0.33958, 0.53295, 0.26679, 0.52364)), .Names = c("gene_symbol", 
"fold_change", "pvalue", "ctr.mean_exp", "tre.mean_exp", "ctr.cv", 
"tre.cv"), row.names = c(NA, -6L), class = c("tbl_df", "tbl", 
"data.frame"))

That looks like this:

> df
# A tibble: 6 × 7
    gene_symbol fold_change  pvalue ctr.mean_exp tre.mean_exp  ctr.cv  tre.cv
         <fctr>       <dbl>   <dbl>        <dbl>        <dbl>   <dbl>   <dbl>
1 0610005C13Rik     1.54037 0.53120      0.00583      0.00899 5.49291 6.06505
2 0610007P14Rik     1.10976 0.00033     59.67286     66.22232 0.20263 0.28827
3 0610009B22Rik     0.78500 0.00000     83.28470     65.37819 0.17445 0.33958
4 0610009L18Rik     0.79852 0.00011      6.88321      5.49638 0.46288 0.53295
5 0610009O20Rik     0.91615 0.00387     14.67696     13.44630 0.25430 0.26679
6 0610010B08Rik     0.87931 0.01455      1.10363      0.97043 0.39564 0.52364

I'd like to round the floats (the 2nd column onward) to 3 digits. How can I do this with dplyr::mutate_all()?

I tried this:

cols <- names(df)[2:7]
# df <- df %>% mutate_each_(funs(round(.,3)), cols)
# Warning message:
#'mutate_each_' is deprecated.
# Use 'mutate_all' instead.
# See help("Deprecated") 

df <- df %>% mutate_all(funs(round(.,3)), cols)

But get the following error:

Error in mutate_impl(.data, dots) : 
  3 arguments passed to 'round'which requires 1 or 2 arguments
asked Apr 10 '17 02:04 by neversaint



2 Answers

While the new across() function is slightly more verbose than the previous mutate_if variant, the dplyr 1.0.0 updates make the tidyverse language and code more consistent and versatile.

This is how to round specified columns:

df %>% mutate(across(2:7, round, 3)) # columns 2-7 by position

df %>% mutate(across(all_of(cols), round, 3)) # columns named in the character vector cols

This is how to round all numeric columns to 3 decimal places:

df %>% mutate(across(where(is.numeric), round, 3))

This is how to round all columns, but it won't work in this case because gene_symbol is not numeric:

df %>% mutate(across(everything(), round, 3))

In place of where(is.numeric), across() accepts other column specifications, such as -1 or -gene_symbol to exclude the first column. See help(tidyselect) for more options.
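As a minimal sketch of the negative selection described above, the example below rounds every column except gene_symbol. Here df_small is a hypothetical two-row stand-in for the question's tibble (values copied from its first two rows). The ~ round(.x, 3) lambda is used instead of round, 3 because dplyr 1.1.0 deprecated passing extra arguments through across()'s dots.

```r
library(dplyr)

# Hypothetical stand-in for the question's tibble (first two rows, three columns)
df_small <- tibble(
  gene_symbol = factor(c("0610005C13Rik", "0610007P14Rik")),
  fold_change = c(1.54037, 1.10976),
  pvalue      = c(0.5312, 0.00033)
)

# Negative selection: round every column except gene_symbol
rounded <- df_small %>% mutate(across(-gene_symbol, ~ round(.x, 3)))
```

gene_symbol is left untouched as a factor, so round() is never applied to a non-numeric column.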


Update for dplyr 1.0.0

The across() function replaces the _if/_all/_at/_each variants of dplyr verbs. https://dplyr.tidyverse.org/dev/articles/colwise.html#how-do-you-convert-existing-code


Since some columns are not numeric, you could use mutate_if, which rounds a column if and only if it is numeric:

df %>% mutate_if(is.numeric, round, 3)
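To illustrate that the mutate_if() and across(where(...)) spellings agree, this sketch applies both to a hypothetical two-row stand-in (df_small) for the question's data and compares the results. It assumes dplyr 1.0.0 or later, where across() is available.

```r
library(dplyr)

# Hypothetical stand-in for the question's tibble (first two rows, three columns)
df_small <- tibble(
  gene_symbol = factor(c("0610005C13Rik", "0610007P14Rik")),
  fold_change = c(1.54037, 1.10976),
  ctr.cv      = c(5.49291, 0.20263)
)

# Scoped verb: round only the columns for which is.numeric() is TRUE
via_if <- df_small %>% mutate_if(is.numeric, round, 3)

# across() equivalent of the same operation
via_across <- df_small %>% mutate(across(where(is.numeric), ~ round(.x, 3)))

# Compare the two results as plain data frames
same <- isTRUE(all.equal(as.data.frame(via_if), as.data.frame(via_across)))
```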

answered Sep 23 '22 19:09 by Arthur Yip


> packageVersion("dplyr")
[1] '0.7.6'

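Use mutate_at() instead, which takes the columns to transform as its second argument:

df %>% mutate_at(2:7, funs(round(., 3)))

mutate_all() has no column-selection argument, so in the original call cols was forwarded to round() as a third argument, which is what triggered the error.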
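funs() has been deprecated since dplyr 0.8.0, so on newer versions the same mutate_at() call can be written without it. A minimal sketch, assuming a hypothetical two-row stand-in (df_small) for the question's tibble; arguments placed after the function, such as digits = 3, are forwarded to round():

```r
library(dplyr)

# Hypothetical stand-in for the question's tibble (first two rows, three columns)
df_small <- tibble(
  gene_symbol = factor(c("0610005C13Rik", "0610007P14Rik")),
  fold_change = c(1.54037, 1.10976),
  pvalue      = c(0.5312, 0.00033)
)
cols <- names(df_small)[2:3]

# Select columns by position
by_position <- df_small %>% mutate_at(2:3, round, digits = 3)

# Select the same columns by name, using the character vector cols
by_name <- df_small %>% mutate_at(cols, round, digits = 3)
```

Both calls round the same two columns; the character-vector form matches the cols variable from the question.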

answered Sep 22 '22 19:09 by Antonio Jesús Pérez Luque