Cannot use multi word variables in dplyr or am I missing something?

Tags: function, r, dplyr

Why doesn't dplyr like this format of 'beta linalool' in my function as compared to beta.linalool?

It took me a few hours of troubleshooting to figure out what the problem was. Is there any way to use data where variables are labeled as more than one word or should I just move everything to the beta.linalool type format?

Everything I have learned has been from Programming with dplyr.

library(ggplot2)
library(readxl)
library(dplyr)
library(magrittr)

Data3<- read_excel("Desktop/Data3.xlsx")

Data3 %>% filter(Variety=="CS 420A"&`Red Blotch`=="-")%>% group_by(`Time Point`)%>%
  summarise(m=mean(`beta linalool`),SD=sd(`beta linalool`))
# A tibble: 4 x 3
  `Time Point`       m         SD
  <chr>           <dbl>      <dbl>
1 End          0.00300  0.000117  
2 Mid          0.00385  0.000353  
3 Must         0.000254 0.00000633
4 Start        0.000785 0.000283  

Now when I work it into a function:

cwine<-function(df,v,rb,c){
  c<-enquo(c)
  df %>% filter(Variety==v&`Red Blotch`==rb)%>% 
    group_by(`Time Point`) %>%
    summarise(m=mean(!!c),SD=sd(!!c))
}
cwine(Data3,"CS 420A","-",'beta linalool')
# A tibble: 4 x 3
  `Time Point`     m    SD
  <chr>        <dbl> <dbl>
1 End             NA    NA
2 Mid             NA    NA
3 Must            NA    NA
4 Start           NA    NA
Warning messages:
1: In mean.default(~"beta linalool") :
  argument is not numeric or logical: returning NA #this statement is repeated 4 more times
5: In var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm = na.rm) :
  NAs introduced by coercion #this statement is repeated 4 more times

The problem seems to be that beta linalool has to be typed in as the string 'beta linalool'. I figured this out by trying the same approach on the iris dataset, where a single-word column such as Petal.Length can be passed as a bare name rather than as a quoted string:

my_function<-function(ds,x,y,c){
  c<-enquo(c)
  ds %>%filter(Sepal.Length>x&Sepal.Width<y) %>% 
    group_by(Species) %>% 
    summarise(m=mean(!!c),SD=sd(!!c))
}
my_function(iris,5,4,Petal.Length)
# A tibble: 3 x 3
  Species        m    SD
  <fct>      <dbl> <dbl>
1 setosa      1.53 0.157
2 versicolor  4.32 0.423
3 virginica   5.57 0.536

In fact my function works fine on a different variable:

> cwine(Data2,"CS 420A","-",nerol)
# A tibble: 4 x 3
  `Time Point`        m        SD
  <chr>           <dbl>     <dbl>
1 End          0.000453 0.0000338
2 Mid          0.000659 0.0000660
3 Must         0.000560 0.0000234
4 Start        0.000927 0.0000224

Is dplyr just that sensitive or am I missing something?

andrewjc asked May 05 '19 05:05


1 Answer

One option is to keep passing the column name as a string, convert it to a symbol with rlang::sym(), and unquote it with !!:

library(tidyverse)
cwine <- function(df,v,rb,c){
  
  df %>% 
      filter(Variety==v & `Red Blotch` == rb)%>% 
      group_by(`Time Point`) %>%
       summarise(m = mean(!!rlang::sym(c)),
                 SD = sd(!! rlang::sym(c))) 
}

cwine(Data3,"CS 420A","-",'beta linalool')
# A tibble: 2 x 3
#  `Time Point`       m    SD
#         <int>   <dbl> <dbl>
#1            2 -2.11    2.23
#2            4  0.0171 NA  
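
As a related sketch (not part of the original answer), the `.data` pronoun described in Programming with dplyr can also look up a column from a string, which sidesteps the space in the name. The names `toy` and `summarise_col` below are hypothetical and exist only for illustration; this assumes a dplyr version that exports `.data`.

```r
library(dplyr)

# Hypothetical toy data with multi-word (non-syntactic) column names.
toy <- tibble(
  `Time Point`    = c("Start", "Start", "End", "End"),
  `beta linalool` = c(1, 3, 10, 20)
)

# col is a character string; .data[[col]] looks it up as a column name,
# so spaces in the name are not a problem.
summarise_col <- function(df, col) {
  df %>%
    group_by(`Time Point`) %>%
    summarise(m = mean(.data[[col]]), SD = sd(.data[[col]]))
}

res_string <- summarise_col(toy, "beta linalool")
res_string
```

Here the column is passed as an ordinary quoted string, so the call looks the same as the failing call in the question.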

Alternatively, if we capture the argument as a quosure with enquo(), it works when the variable name is passed in backquotes. Usually a bare, unquoted name works, but here the name contains a space, so backquotes are needed for it to be read as a single column name.

cwine <- function(df,v,rb,c){
  c1 <- enquo(c)
  df %>% 
      filter(Variety==v & `Red Blotch` == rb)%>% 
      group_by(`Time Point`) %>%
       summarise(m = mean(!! c1 ),
                 SD = sd(!! c1)) 
}

cwine(Data3,"CS 420A","-",`beta linalool`)
# A tibble: 2 x 3
#   `Time Point`       m    SD
#         <int>   <dbl> <dbl>
#1            2 -2.11    2.23
#2            4  0.0171 NA   
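
For comparison, a minimal sketch of the same idea using the `{{ }}` (curly-curly) operator, which combines enquo() and !! in one step. This assumes rlang 0.4.0 or later, which may be newer than the setup in the question; `toy` and `summarise_bare` are hypothetical names used only for illustration.

```r
library(dplyr)

# Hypothetical toy data with multi-word (non-syntactic) column names.
toy <- tibble(
  `Time Point`    = c("Start", "Start", "End", "End"),
  `beta linalool` = c(1, 3, 10, 20)
)

# {{ col }} captures the caller's expression and unquotes it in place,
# so no separate enquo() step is needed.
summarise_bare <- function(df, col) {
  df %>%
    group_by(`Time Point`) %>%
    summarise(m = mean({{ col }}), SD = sd({{ col }}))
}

# The multi-word column is still passed in backquotes, not as a string.
res_bare <- summarise_bare(toy, `beta linalool`)
res_bare
```

As with enquo(), passing the quoted string 'beta linalool' to this version would hand mean() a character value rather than the column.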

data

set.seed(24)
Data3 <- tibble(Variety = sample(c("CS 420A", "CS 410A"), 20, replace = TRUE),
`Red Blotch` = sample(c("-", "+"), 20, replace = TRUE), 
`Time Point` = sample(1:4, 20, replace = TRUE),
`beta linalool` = rnorm(20))
akrun answered Oct 16 '22 10:10