I would like to call tidyr::gather() inside a custom function, to which I pass a pair of character variables that will be used to rename the key and value columns, e.g.
myFunc <- function(mydata, key.col, val.col) {
  new.data <- tidyr::gather(data = mydata, key = key.col, value = val.col)
  return(new.data)
}
However, this does not work as desired.
temp.data <- data.frame(day.1 = c(20, 22, 23), day.2 = c(32, 22, 45), day.3 = c(17, 9, 33))
# Call my custom function, renaming the key and value columns
# "day" and "temp", respectively
long.data <- myFunc(mydata = temp.data, key.col = "day", val.col = "temp")
# Columns have *not* been renamed as desired
head(long.data)
key.col val.col
1 day.1 20
2 day.1 22
3 day.1 23
4 day.2 32
5 day.2 22
6 day.2 45
Desired output:
head(long.data)
day temp
1 day.1 20
2 day.1 22
3 day.1 23
4 day.2 32
5 day.2 22
6 day.2 45
My understanding is that gather() uses bare variable names for most arguments (as it has in this example, using "key.col" as the column name as opposed to the value stored in key.col). I have attempted a number of ways of passing a value in the gather() call, but most return errors. For example, these three variants of the gather() call within myFunc return Error: Invalid column specification (ignoring, for illustrative purposes, the value parameter, which has identical behavior):
gather(data = mydata, key = as.character(key.col), value = val.col)
gather(data = mydata, key = as.name(key.col), value = val.col)
gather(data = mydata, key = as.name(as.character(key.col)), value = val.col)
As a workaround, I just rename the columns following the call to gather():
colnames(long.data)[colnames(long.data) == "key"] <- "day"
But given gather()'s purported functionality for renaming the key/value columns, how can I do this in the gather() call within a custom function?
To put it in a function you have to use gather_(), like so:
myFunc <- function(mydata, key.col, val.col, gather.cols) {
  new.data <- gather_(data = mydata,
                      key_col = key.col,
                      value_col = val.col,
                      gather_cols = colnames(mydata)[gather.cols])
  return(new.data)
}
temp.data <- data.frame(day.1 = c(20, 22, 23), day.2 = c(32, 22, 45),
day.3 = c(17, 9, 33))
temp.data
day.1 day.2 day.3
1 20 32 17
2 22 22 9
3 23 45 33
# Call my custom function, renaming the key and value columns
# "day" and "temp", respectively
long.data <- myFunc(mydata = temp.data, key.col = "day", val.col =
"temp", gather.cols = 1:3)
# Columns *have* been renamed as desired
head(long.data)
day temp
1 day.1 20
2 day.1 22
3 day.1 23
4 day.2 32
5 day.2 22
6 day.2 45
As stated, the main difference is that with gather_() you have to specify the columns you want to gather up, using the gather_cols argument.
...and having had the same question, I now found the answer here: https://dplyr.tidyverse.org/articles/programming.html
You can have tidy evaluation substitute the value stored in a variable by prefixing the variable with two exclamation marks (!!). In your original question, the code would be:
gather(data = mydata, key = !!key.col, value = !!val.col)
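For completeness, here is a minimal sketch of the whole function from the question written with !!. It assumes a tidyr version whose gather() supports quasiquotation for key and value (0.7.0 or later, as far as I know); the body is my reading of the suggestion above, not something taken from the linked article.

```r
# Sketch only: myFunc from the question, with the column-name strings unquoted.
# Assumes tidyr >= 0.7.0, where gather() accepts !! for key and value.
myFunc <- function(mydata, key.col, val.col) {
  # !! passes the strings stored in key.col and val.col ("day", "temp")
  # instead of the literal argument names "key.col" and "val.col".
  # No columns are selected, so gather() gathers every column.
  tidyr::gather(data = mydata, key = !!key.col, value = !!val.col)
}

temp.data <- data.frame(day.1 = c(20, 22, 23),
                        day.2 = c(32, 22, 45),
                        day.3 = c(17, 9, 33))

long.data <- myFunc(mydata = temp.data, key.col = "day", val.col = "temp")
colnames(long.data)
```

If this works as I expect, colnames(long.data) shows the requested names, so the renaming step after the call is no longer needed.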