
mutate_at in R with lambda function?

I have a data frame with 100 columns. Each column represents a probability value.

I want to min-max scale some of the columns, and I am using the following transformation:

df <- df %>%
      mutate_at(vars(specific_columns), 
                funs(function(x) {((x - min(x)) / (max(x) - min(x)))}))

But it fails with an error instead of producing the scaled output I want.

For example the sample data is:

col1        col2        col3        col4        col5        
0.014492754 0.014492754 0.014492754 0.014492754 0.014492754 
0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 
0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 
0.028985507 0.028985507 0.028985507 0.028985507 0.028985507 
0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 
0.014492754 0.014492754 0.014492754 0.014492754 0.014492754 
0.014492754 0.014492754 0.014492754 0.014492754 0.014492754 
0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 
0.010989011 0.010989011 0.010989011 0.010989011 0.010989011 
0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 

Error:

Error in mutate_impl(.data, dots) : Column col1 is of unsupported type function

SteveS asked May 13 '18 07:05


1 Answer

Try this syntax instead:

library(dplyr)
df %>% mutate_at(vars(everything()), funs(((. - min(.)) / (max(.) - min(.)))))
#>         col1      col2      col3      col4      col5
#> 1  0.5000000 0.5000000 0.5000000 0.5000000 0.5000000
#> 2  0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#> 3  0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#> 4  1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
#> 5  0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#> 6  0.5000000 0.5000000 0.5000000 0.5000000 0.5000000
#> 7  0.5000000 0.5000000 0.5000000 0.5000000 0.5000000
#> 8  0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#> 9  0.3791209 0.3791209 0.3791209 0.3791209 0.3791209
#> 10 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000

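funs() interprets a pseudo-function for you. In your code, the expression inside funs() is a function definition, so evaluating it for each column returns a function object rather than scaled values, hence the "unsupported type function" error. funs() helps with two cases which would not otherwise work: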

  1. Character name of a function (eg. "mean")
  2. A call to a function with . as a dummy argument (as in my example)
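
The difference between the two forms can be seen without dplyr. The following is only a simplified analogy in base R, not a description of how funs() is implemented internally: evaluating a function definition gives back a function, while evaluating an expression written in terms of . (with . bound to a column) gives back numbers. The vector x is made-up illustration data.

```r
# Made-up column values, for illustration only.
x <- c(0, 0.5, 1)

# Evaluating a function *definition* yields a function object, not numbers.
# This mirrors what happens when a definition is wrapped in funs().
result_fn <- eval(quote(function(x) (x - min(x)) / (max(x) - min(x))))
is.function(result_fn)   # TRUE

# Evaluating an expression in terms of `.`, with `.` bound to the column,
# yields the scaled values themselves.
result_val <- eval(quote((. - min(.)) / (max(.) - min(.))), list(. = x))
is.numeric(result_val)   # TRUE
```

Calling result_fn(x) would of course give the same numbers, which is why passing the function itself to mutate_at(), without funs(), also works.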

If you have already declared your own (anonymous) function, there is no need to use funs() since mutate_at() will accept this as-is:

mutate_at(df, vars(everything()), function(x) {((x - min(x)) / (max(x) - min(x)))})

or

my_func <- function(x) {((x - min(x)) / (max(x) - min(x)))}
mutate_at(df, vars(everything()), my_func)
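
Note for newer dplyr versions: funs() was deprecated in dplyr 0.8.0, and the scoped verbs such as mutate_at() were superseded by across() in dplyr 1.0.0. A minimal sketch of the same scaling with across(), assuming dplyr 1.0.0 or later is installed. The helper name rescale01 is hypothetical, and the data frame reuses only the first two columns of the sample data from the question:

```r
library(dplyr)

# Hypothetical helper name; the formula is the one from the question.
# Caveat: a constant column has max(x) == min(x), so this returns NaN for it.
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# First two columns of the sample data from the question.
vals <- c(0.014492754, 0, 0, 0.028985507, 0,
          0.014492754, 0.014492754, 0, 0.010989011, 0)
df <- data.frame(col1 = vals, col2 = vals)

# across() applies the function to every selected column.
# Replace everything() with c(col1, col2) or another selection as needed.
scaled <- df %>% mutate(across(everything(), rescale01))

# Row 4 holds the column maximum, so it maps to 1; zeros stay at 0.
scaled
```

A purrr-style lambda works too, for example across(everything(), ~ (.x - min(.x)) / (max(.x) - min(.x))), if you prefer not to name the function.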

ruaridhw answered Oct 01 '22 02:10