dplyr string as column reference

Tags: r, dplyr

Is there any way to pass a string as a column reference to a dplyr function?

Here is an example: a grouped dataset and a simple function in which I try to pass a string that names a column. Thanks!

machines <- data.frame(Date=c("1/31/2014", "1/31/2014", "2/28/2014", "2/28/2014", "3/31/2014", "3/31/2014"), 
            Model.Num=c("123", "456", "123", "456", "123", "456"), 
            Cost=c(200, 300, 250, 350, 300, 400))

my.fun <- function(data, colname){
    mutate(data, position=cumsum(as.name(colname)))
}

machines <- machines %>% group_by(Date, Model.Num)     
machines <- my.fun(machines, "Cost")
asked Jan 17 '15 at 00:01 by user1839897


2 Answers

Here's an option that uses interp() from the lazyeval package, which came with your dplyr install. Inside your function(s), you'll need to use the standard evaluation version of the dplyr functions. In this case that would be mutate_().

Note that the new column position will be identical to the Cost column here because of how you've set up the grouping in machines. The second call to my_fun() shows it working on a different set of grouping variables.
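The grouping point above can be checked by counting rows per group. This is a minimal sketch, assuming a current dplyr and the machines data from the question; the names group_sizes and by_model are introduced here for illustration only.

```r
library(dplyr)

machines <- data.frame(
  Date = c("1/31/2014", "1/31/2014", "2/28/2014", "2/28/2014", "3/31/2014", "3/31/2014"),
  Model.Num = c("123", "456", "123", "456", "123", "456"),
  Cost = c(200, 300, 250, 350, 300, 400)
)

# Every Date/Model.Num combination occurs exactly once, so each group
# has a single row and a per-group cumsum() just returns Cost itself.
group_sizes <- count(machines, Date, Model.Num)

# Grouping by Model.Num alone gives groups of three rows each,
# so a cumulative sum accumulates within each model.
by_model <- count(machines, Model.Num)
```

With one row per group there is nothing to accumulate, which is why position equals Cost under the original grouping.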

library(dplyr)
library(lazyeval)

my_fun <- function(data, col) {
    mutate_(data, position = interp(~ cumsum(x), x = as.name(col)))
}

my_fun(machines, "Cost")
#        Date Model.Num Cost position
# 1 1/31/2014       123  200      200
# 2 1/31/2014       456  300      300
# 3 2/28/2014       123  250      250
# 4 2/28/2014       456  350      350
# 5 3/31/2014       123  300      300
# 6 3/31/2014       456  400      400

## second example - different grouping
my_fun(group_by(machines, Model.Num), "Cost")
#        Date Model.Num Cost position
# 1 1/31/2014       123  200      200
# 2 1/31/2014       456  300      300
# 3 2/28/2014       123  250      450
# 4 2/28/2014       456  350      650
# 5 3/31/2014       123  300      750
# 6 3/31/2014       456  400     1050
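In dplyr 0.7.0 and later the underscore verbs such as mutate_() are deprecated. One possible replacement is the .data pronoun, which looks a column up by its string name. The following is a sketch under that assumption, not part of the original answer; my_fun2 is a hypothetical name.

```r
library(dplyr)

machines <- data.frame(
  Date = c("1/31/2014", "1/31/2014", "2/28/2014", "2/28/2014", "3/31/2014", "3/31/2014"),
  Model.Num = c("123", "456", "123", "456", "123", "456"),
  Cost = c(200, 300, 250, 350, 300, 400)
)

# my_fun2 is a hypothetical name; `col` is a string naming the column.
# .data[[col]] retrieves that column from the current group's data.
my_fun2 <- function(data, col) {
  mutate(data, position = cumsum(.data[[col]]))
}

result <- my_fun2(group_by(machines, Model.Num), "Cost")
```

This keeps the ordinary mutate() call and needs no extra package beyond dplyr.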
answered Oct 18 '22 at 16:10 by Rich Scriven


We can use standard evaluation without the lazyeval package. Build the expression as a string with paste0(), give it the new column's name with setNames(), and pass the result to mutate_() through its .dots argument.

library(tidyverse)

machines <- data.frame(
  Date = c("1/31/2014", "1/31/2014", "2/28/2014", "2/28/2014", "3/31/2014", "3/31/2014"), 
  Model.Num = c("123", "456", "123", "456", "123", "456"), 
  Cost = c(200, 300, 250, 350, 300, 400)
  )

my_fun <- function(data, col) {
  mutate_(data, .dots = setNames(paste0("cumsum(", col, ")"), "position"))
}

my_fun(machines %>% group_by(Date, Model.Num), "Cost")
# Source: local data frame [6 x 4]
# Groups: Date, Model.Num [6]
# 
#        Date Model.Num  Cost position
#      <fctr>    <fctr> <dbl>    <dbl>
# 1 1/31/2014       123   200      200
# 2 1/31/2014       456   300      300
# 3 2/28/2014       123   250      250
# 4 2/28/2014       456   350      350
# 5 3/31/2014       123   300      300
# 6 3/31/2014       456   400      400
my_fun(machines %>% group_by(Model.Num), "Cost")
# Source: local data frame [6 x 4]
# Groups: Model.Num [2]
# 
#        Date Model.Num  Cost position
#      <fctr>    <fctr> <dbl>    <dbl>
# 1 1/31/2014       123   200      200
# 2 1/31/2014       456   300      300
# 3 2/28/2014       123   250      450
# 4 2/28/2014       456   350      650
# 5 3/31/2014       123   300      750
# 6 3/31/2014       456   400     1050
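The setNames() call above names the output column from a string. In current dplyr the same idea can be sketched with the `:=` operator and `!!`, which avoids parsing pasted code (a pasted string would fail to parse for a column name containing a space, for example). This is an illustration under those assumptions; my_fun3 and the out argument are hypothetical names, not from the original answer.

```r
library(dplyr)

machines <- data.frame(
  Date = c("1/31/2014", "1/31/2014", "2/28/2014", "2/28/2014", "3/31/2014", "3/31/2014"),
  Model.Num = c("123", "456", "123", "456", "123", "456"),
  Cost = c(200, 300, 250, 350, 300, 400)
)

# my_fun3 and `out` are hypothetical names introduced for illustration.
# `col` names the input column, `out` names the new column; both are strings.
my_fun3 <- function(data, col, out = "position") {
  mutate(data, !!out := cumsum(.data[[col]]))
}

result <- my_fun3(group_by(machines, Model.Num), "Cost", out = "running_cost")
```

Here both the input and output column names are plain strings, so the function can be driven from configuration or a loop.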
answered Oct 18 '22 at 14:10 by Keiku