
Pass column name to function from mutate_each

Tags: r, dplyr

I'd like to apply a transformation to all columns via dplyr::mutate_each, e.g.

library(dplyr)
mult <- function(x,m) return(x*m)
mtcars %>% mutate_each(funs(mult(.,2)))    # Multiply all columns by a factor of two

However, the transformation should take parameters that depend on the column name, so the column name has to be passed to the function as an additional argument:

named.mult <- function(x,colname) return(x*param.A[[colname]])

Example: multiply every column by a different factor:

param.A <- c()
param.A[names(mtcars)] <- seq(length(names(mtcars)))
param.A
# mpg  cyl disp   hp drat   wt qsec   vs   am gear carb 
#   1    2    3    4    5    6    7    8    9   10   11 

Since the column name gets lost during mutate_each, I currently work around this by passing a list of lazily evaluated formulas to mutate_ (the SE version):

library(lazyeval)
named.mutate <- function(fun, cols) sapply(cols, function(n) interp(~fun(col, n), fun=fun, col=as.name(n)))
mtcars %>% mutate_(.dots=named.mutate(named.mult, names(.)))

This works, but is there a special variable like .name that holds the column name of . for each column-wise call? Then I could do something like

mtcars %>% mutate_each(funs(named.mult(.,.name)))
DeltaKappa asked Oct 22 '15 13:10



1 Answer

I'd suggest taking a different approach. Instead of using mutate_each, you can reshape the data to long format with tidyr::gather, apply dplyr::mutate, and reshape back with tidyr::spread to achieve the same result.

For example:

library(dplyr)
library(tidyr)

data(mtcars)

# Multiply each column by a different integer
mtcars %>% 
  dplyr::tbl_df() %>%
  dplyr::mutate(make_and_model = rownames(mtcars)) %>%
  tidyr::gather(key, value, -make_and_model) %>% 
  dplyr::mutate(m = as.integer(factor(key)),   # a multiplication factor dependent on column name
                value = value * m) %>% 
  dplyr::select(-m) %>%
  tidyr::spread(key, value)

# compare to the original data
mtcars[order(rownames(mtcars)), order(names(mtcars))]

# the multiplicative values used.
mtcars %>% 
  tidyr::gather() %>% 
  dplyr::mutate(m = as.integer(factor(key))) %>% 
  dplyr::select(-value) %>%
  dplyr::distinct()
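
The example above derives the multiplier from factor(key), which need not match the param.A lookup from the question. As a minimal sketch of the same gather/mutate/spread idea, assuming param.A is a named vector as defined in the question (the name result is only illustrative), the factor can be looked up by column name once the data is in long format:

```r
library(dplyr)
library(tidyr)

# Lookup vector as in the question: one multiplier per column name
param.A <- setNames(seq_along(names(mtcars)), names(mtcars))

result <- mtcars %>%
  dplyr::mutate(make_and_model = rownames(mtcars)) %>%
  tidyr::gather(key, value, -make_and_model) %>%
  # key holds the original column name, so it can index param.A directly;
  # unname() keeps the lookup's names from being attached to the value column
  dplyr::mutate(value = value * unname(param.A[as.character(key)])) %>%
  tidyr::spread(key, value)
```

As with the example above, spread returns the rows and columns in sorted order, so compare against mtcars[order(rownames(mtcars)), order(names(mtcars))].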
Peter answered Oct 12 '22 23:10
