 

Pass variable to tidyr's gather to rename key/value columns?

Tags: r, tidyr

I would like to call tidyr::gather() inside a custom function, to which I pass a pair of character variables that will be used to rename the key and value columns. e.g.

myFunc <- function(mydata, key.col, val.col) {
    new.data <- tidyr::gather(data = mydata, key = key.col, value = val.col)
    return(new.data)    
}

However, this does not work as desired.

temp.data <- data.frame(day.1 = c(20, 22, 23), day.2 = c(32, 22, 45), day.3 = c(17, 9, 33))

# Call my custom function, renaming the key and value columns 
# "day" and "temp", respectively
long.data <- myFunc(mydata = temp.data, key.col = "day", val.col = "temp")

# Columns have *not* been renamed as desired
head(long.data)
  key.col val.col
1   day.1      20
2   day.1      22
3   day.1      23
4   day.2      32
5   day.2      22
6   day.2      45

Desired output:

head(long.data)
    day temp
1 day.1   20
2 day.1   22
3 day.1   23
4 day.2   32
5 day.2   22
6 day.2   45

My understanding is that gather() uses bare variable names for most arguments (as it has in this example, using "key.col" as the column name as opposed to the value stored in key.col). I have attempted a number of ways of passing a value in the gather() call, but most return errors. For example, these three variants of the gather() call within myFunc return Error: Invalid column specification (ignoring, for illustrative purposes, the value parameter, which has identical behavior):

gather(data = mydata, key = as.character(key.col), value = val.col)

gather(data = mydata, key = as.name(key.col), value = val.col)

gather(data = mydata, key = as.name(as.character(key.col)), value = val.col)

As a workaround, I just rename the columns following the call to gather():

colnames(long.data)[colnames(long.data) == "key"] <- "day"

But given gather()'s purported functionality for renaming the key/value columns, how can I do this in the gather() call within a custom function?

Jeff asked Jun 10 '16 19:06



2 Answers

To do this inside a function, use gather_(), the standard-evaluation variant that takes column names as strings:

myFunc <- function(mydata, key.col, val.col, gather.cols) {
  new.data <- gather_(data = mydata,
                      key_col = key.col,
                      value_col = val.col,
                      gather_cols = colnames(mydata)[gather.cols])
  return(new.data)    
}

temp.data <- data.frame(day.1 = c(20, 22, 23), day.2 = c(32, 22, 45),
day.3 = c(17, 9, 33))
temp.data


  day.1 day.2 day.3
1    20    32    17
2    22    22     9
3    23    45    33

# Call my custom function, renaming the key and value columns 
# "day" and "temp", respectively

long.data <- myFunc(mydata = temp.data, key.col = "day", val.col = "temp",
                    gather.cols = 1:3)
# Columns *have* been renamed as desired
head(long.data)

    day temp
1 day.1   20
2 day.1   22
3 day.1   23
4 day.2   32
5 day.2   22
6 day.2   45

The main difference is that gather_() requires you to name the columns to gather explicitly, through the gather_cols argument.
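Since gather_() is deprecated in current tidyr, here is a minimal sketch of the same idea with pivot_longer(), whose names_to and values_to arguments accept strings directly. This swaps in a different function from the one used above, and it assumes every column should be gathered (cols = everything()); adjust that selection for your own data.

```r
library(tidyr)

# Hypothetical rewrite of myFunc: pivot_longer() is swapped in for gather_()
myFunc <- function(mydata, key.col, val.col) {
  new.data <- pivot_longer(data = mydata,
                           cols = everything(),  # assumption: gather every column
                           names_to = key.col,   # string used as the key column name
                           values_to = val.col)  # string used as the value column name
  # pivot_longer() returns a tibble; convert back to a plain data frame
  return(as.data.frame(new.data))
}

temp.data <- data.frame(day.1 = c(20, 22, 23), day.2 = c(32, 22, 45),
                        day.3 = c(17, 9, 33))

long.data <- myFunc(mydata = temp.data, key.col = "day", val.col = "temp")
colnames(long.data)
```

Note that pivot_longer() orders the result row by row of the input (day.1, day.2, day.3, day.1, ...), so head(long.data) will not match the gather_() output above line for line, although the same nine values are present.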

Bryan Goggin answered Jan 02 '23 12:01



...and having had the same question, I now found the answer here: https://dplyr.tidyverse.org/articles/programming.html

You can make gather() use the value stored in a variable, rather than the variable's name, by prefixing the variable with two exclamation marks (!!). In your original question, the code would be:

gather(data = mydata, key = !!key.col, value = !!val.col)
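Put back into the function from the question, that could look like the sketch below. It assumes a tidyr version recent enough to support !! in gather() (0.7.0 or later, as far as I know) and reuses the question's own names.

```r
library(tidyr)

myFunc <- function(mydata, key.col, val.col) {
  # !! unquotes key.col and val.col, so gather() sees the strings they hold
  # ("day", "temp") instead of the literal argument names
  new.data <- gather(data = mydata, key = !!key.col, value = !!val.col)
  return(new.data)
}

temp.data <- data.frame(day.1 = c(20, 22, 23), day.2 = c(32, 22, 45),
                        day.3 = c(17, 9, 33))

long.data <- myFunc(mydata = temp.data, key.col = "day", val.col = "temp")
colnames(long.data)
# [1] "day"  "temp"
```

Because no columns are listed after value, gather() collects all of them, which matches the behavior of the original call.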
invertdna answered Jan 02 '23 13:01
