I want to pass arrange() {dplyr} a vector of variable names to sort on. Usually I just type in the variables I want, but I'm trying to make a function where the sorting variables can be input as a function parameter.
    df <- structure(list(
      var1 = c(1L, 2L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 4L),
      var2 = structure(c(10L, 1L, 8L, 3L, 5L, 4L, 7L, 9L, 2L, 6L),
        .Label = c("b", "c", "f", "h", "i", "o", "s", "t", "w", "x"),
        class = "factor"),
      var3 = c(7L, 5L, 5L, 8L, 5L, 8L, 6L, 7L, 5L, 8L),
      var4 = structure(c(8L, 5L, 1L, 4L, 7L, 4L, 3L, 6L, 9L, 2L),
        .Label = c("b", "c", "d", "e", "f", "h", "i", "w", "y"),
        class = "factor")),
      .Names = c("var1", "var2", "var3", "var4"),
      row.names = c(NA, -10L), class = "data.frame")

    # this is the normal way to arrange df with dplyr
    df %>% arrange(var3, var4)

    # but none of these (below) work for passing a vector of variables
    vector_of_vars <- c("var3", "var4")
    df %>% arrange(vector_of_vars)
    df %>% arrange(get(vector_of_vars))
    df %>% arrange(eval(parse(text = paste(vector_of_vars, collapse = ", "))))
arrange() orders the rows of a data frame by the values of selected columns.
The dplyr function arrange() can be used to reorder (or sort) rows by one or more variables. Instead of using the function desc(), you can prefix a numeric sorting variable with a minus sign to indicate descending order. If the data contain missing values, they will always come at the end.
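As a small illustration of the two descending-order spellings and of where missing values land, here is a sketch on a made-up data frame (toy and its columns are hypothetical, not from the question):

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
toy <- data.frame(
  id = c("a", "b", "c", "d"),
  score = c(2, NA, 5, 1),
  stringsAsFactors = FALSE
)

# Descending order via desc()
by_desc <- toy %>% arrange(desc(score))

# Descending order via a unary minus (only meaningful for numeric columns)
by_minus <- toy %>% arrange(-score)

# In both results the row with the missing score comes last
print(by_desc)
```

The minus sign is just arithmetic negation, so for factor or character columns desc() is the safer choice.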
To sort a data frame in base R, use the order() function. By default, sorting is ascending; prefix the sorting variable with a minus sign to sort in descending order.
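For comparison with the dplyr approach, a minimal base R sketch of the same idea, again on a made-up data frame (toy is hypothetical):

```r
# Hypothetical toy data, invented for this illustration
toy <- data.frame(
  id = c("a", "b", "c", "d"),
  score = c(2, 3, 5, 1),
  stringsAsFactors = FALSE
)

# order() returns the row indices that put `score` in ascending order
asc <- toy[order(toy$score), ]

# Negating a numeric column reverses the order
dsc <- toy[order(-toy$score), ]

# order(..., decreasing = TRUE) is an equivalent spelling
dsc2 <- toy[order(toy$score, decreasing = TRUE), ]
```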
The arrange() function reorders the rows of a data frame or table by column. The columns are passed to the function as expressions.
Hadley hasn't made this obvious in the help file--only in his NSE vignette. The versions of the functions followed by underscores use standard evaluation, so you pass them vectors of strings and the like.
If I understand your problem correctly, you can just replace arrange() with arrange_() and it will work.
Specifically, pass the vector of strings as the .dots argument when you do it.
    > df %>% arrange_(.dots=c("var1","var3"))
       var1 var2 var3 var4
    1     1    i    5    i
    2     1    x    7    w
    3     1    h    8    e
    4     2    b    5    f
    5     2    t    5    b
    6     2    w    7    h
    7     3    s    6    d
    8     3    f    8    e
    9     4    c    5    y
    10    4    o    8    c
========== Update March 2018 ==============
Using the standard evaluation versions in dplyr as I have shown here is now considered deprecated. You can read Hadley's programming vignette for the new way. Basically you will use !! to unquote one variable or !!! to unquote a vector of variables inside of arrange().
When you pass those columns, if they are bare, quote them using quo() for one variable or quos() for a vector. Don't use quotation marks. See the answer by Akrun.
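A minimal sketch of the bare-name case, assuming a small made-up data frame (small_df is hypothetical and only borrows the column names var1 and var3). quo() and quos() are called through the rlang namespace here to make their origin explicit:

```r
library(dplyr)

# Hypothetical data, invented for this illustration
small_df <- data.frame(var1 = c(2L, 1L, 2L, 1L), var3 = c(5L, 7L, 4L, 6L))

# Quote bare column names (no quotation marks)
one_var   <- rlang::quo(var3)
many_vars <- rlang::quos(var1, var3)

# !! unquotes a single quoted variable
sorted_one <- small_df %>% arrange(!!one_var)

# !!! splices a list of quoted variables into separate arguments
sorted_many <- small_df %>% arrange(!!!many_vars)
```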
If your columns are already strings, then make them names using rlang::sym() for a single column or rlang::syms() for a vector. See the answer by Christos. You can also use as.name() for a single column. Unfortunately as of this writing, the information on how to use rlang::sym() has not yet made it into the vignette I link to above (eventually it will be in the section on "variadic quasiquotation" according to his draft).
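For the case in the question, where the sort columns arrive as a character vector in a function parameter, a sketch along these lines should work. The helper name arrange_by_names and the small data frame are hypothetical, made up for this example:

```r
library(dplyr)

# Hypothetical helper: sort `data` by the columns named in the
# character vector `sort_vars`
arrange_by_names <- function(data, sort_vars) {
  # syms() turns each string into a symbol; !!! splices them into arrange()
  data %>% arrange(!!!rlang::syms(sort_vars))
}

# Hypothetical data, invented for this illustration
small_df <- data.frame(
  var3 = c(7L, 5L, 5L, 8L),
  var4 = c("w", "f", "b", "e"),
  stringsAsFactors = FALSE
)

vector_of_vars <- c("var3", "var4")
sorted <- arrange_by_names(small_df, vector_of_vars)

# A single string works with sym() or as.name() and the !! operator
single      <- small_df %>% arrange(!!rlang::sym("var4"))
single_base <- small_df %>% arrange(!!as.name("var4"))
```

Wrapping the splice in a function keeps the tidy-evaluation details in one place, so callers only ever pass plain strings.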