
R dplyr method mutate variable if exists

Tags:

r

dplyr

As a big fan of dplyr and its tidy data concept, I would like to mutate a specific variable whenever it exists in a dataframe. This is the idea:

# Load libraries
library(dplyr)

# Create data frames
df1 <- data.frame(year = 2000:2010, foo = 0:10)
df2 <- data.frame(year = 2000:2010)

# Create function
cnd_mtt <- function(df){
  df %>%
    mutate_if(colname == "foo", as.factor) # <---- this is the tricky part
}

Expected result: the function should work for both data frames without raising an error.

Ideas?

Fierr asked Dec 13 '22 18:12


2 Answers

You can use mutate_at with one_of, which raises a warning (rather than an error) if the column doesn't exist:

cnd_mtt <- function(df){
    df %>%
        mutate_at(vars(one_of('foo')), as.factor)
}

cnd_mtt(df2)
#   year
#1  2000
#2  2001
#3  2002
#4  2003
#5  2004
#6  2005
#7  2006
#8  2007
#9  2008
#10 2009
#11 2010
# Warning message:
# Unknown variables: `foo`

Just to clarify, the warning is raised by one_of when it fails to find the column name in its vars argument:

one_of('foo', vars = names(df1))
# [1] 2
one_of('foo', vars = names(df2))
# integer(0)
# Warning message:
# Unknown variables: `foo`

If you also want to get rid of the warning, following @Gregor's comment, you can use mutate_at with if/else and return integer(0) when foo is not among the columns:

df2 %>% 
    mutate_at(if('foo' %in% names(.)) 'foo' else integer(0), as.factor)

#   year
#1  2000
#2  2001
#3  2002
#4  2003
#5  2004
#6  2005
#7  2006
#8  2007
#9  2008
#10 2009
#11 2010
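
As an alternative to the mutate_at calls above, and assuming dplyr 1.0.0 or later is installed, the same "mutate only if the column exists" idea can be sketched with across() and any_of(). any_of() skips names that are absent from the data, so neither a warning nor an error is expected for df2. This is a minimal sketch that swaps in a different selection helper, reusing the names from the question:

```r
library(dplyr)

# Data frames from the question
df1 <- data.frame(year = 2000:2010, foo = 0:10)
df2 <- data.frame(year = 2000:2010)

# any_of() silently ignores names that are not present in the data,
# so across() selects zero columns when `foo` is missing.
cnd_mtt <- function(df) {
  df %>%
    mutate(across(any_of("foo"), as.factor))
}

res1 <- cnd_mtt(df1)  # foo exists: converted to a factor
res2 <- cnd_mtt(df2)  # foo missing: columns are left as they were
```

Passing a longer character vector to any_of(), for example any_of(c("foo", "boo")), should convert whichever of those columns happen to exist.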
Psidom answered Dec 17 '22 23:12


Building on Psidom's answer, you can also use quietly (from the purrr package) to avoid the warning:

# quietly() comes from purrr; it captures the warning instead of printing it
df2 %>%
  mutate_at(vars(purrr::quietly(one_of)("foo", "boo", .vars = tidyselect::peek_vars())$result),
            as.factor)
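
To make the $result part less mysterious, here is a small sketch of what quietly does on its own, assuming purrr is installed. The helper noisy_identity is hypothetical and exists only for illustration; it stands in for one_of, which warns about unknown columns:

```r
library(purrr)

# Hypothetical helper for illustration only: it warns, then returns its input.
noisy_identity <- function(x) {
  warning("something looked off")
  x
}

# quietly() wraps a function so that warnings, messages and printed output
# are captured in a list instead of being emitted.
quiet_identity <- quietly(noisy_identity)
out <- quiet_identity(42)

out$result    # the value the wrapped function returned
out$warnings  # character vector with the captured warning text
```

The returned list also has output and messages components, which is why the answer above has to pull out $result before handing it to vars().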
Matifou answered Dec 17 '22 23:12