
Pass a vector of variable names to arrange() in dplyr

I want to pass arrange() {dplyr} a vector of variable names to sort on. Usually I just type in the variables I want, but I'm trying to make a function where the sorting variables can be input as a function parameter.

    df <- structure(list(
      var1 = c(1L, 2L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 4L),
      var2 = structure(c(10L, 1L, 8L, 3L, 5L, 4L, 7L, 9L, 2L, 6L),
        .Label = c("b", "c", "f", "h", "i", "o", "s", "t", "w", "x"),
        class = "factor"),
      var3 = c(7L, 5L, 5L, 8L, 5L, 8L, 6L, 7L, 5L, 8L),
      var4 = structure(c(8L, 5L, 1L, 4L, 7L, 4L, 3L, 6L, 9L, 2L),
        .Label = c("b", "c", "d", "e", "f", "h", "i", "w", "y"),
        class = "factor")),
      .Names = c("var1", "var2", "var3", "var4"),
      row.names = c(NA, -10L), class = "data.frame")

    # this is the normal way to arrange df with dplyr
    df %>% arrange(var3, var4)

    # but none of these (below) work for passing a vector of variables
    vector_of_vars <- c("var3", "var4")
    df %>% arrange(vector_of_vars)
    df %>% arrange(get(vector_of_vars))
    df %>% arrange(eval(parse(text = paste(vector_of_vars, collapse = ", "))))
asked Oct 21 '14 22:10 by rsoren



1 Answer

Hadley hasn't made this obvious in the help file--only in his NSE vignette. The versions of the functions followed by underscores use standard evaluation, so you pass them vectors of strings and the like.

If I understand your problem correctly, you can just replace arrange() with arrange_() and it will work.

Specifically, pass the vector of strings as the .dots argument when you do it.

    > df %>% arrange_(.dots=c("var1","var3"))
       var1 var2 var3 var4
    1     1    i    5    i
    2     1    x    7    w
    3     1    h    8    e
    4     2    b    5    f
    5     2    t    5    b
    6     2    w    7    h
    7     3    s    6    d
    8     3    f    8    e
    9     4    c    5    y
    10    4    o    8    c

========== Update March 2018 ==============

The standard evaluation versions in dplyr that I showed here (the functions ending in an underscore, such as arrange_()) are now deprecated. You can read Hadley's programming vignette for the new way. Basically, you use !! to unquote one variable or !!! to unquote a vector of variables inside arrange().

When you pass those columns, if they are bare, quote them using quo() for one variable or quos() for a vector. Don't use quotation marks. See the answer by Akrun.
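A minimal sketch of the bare-column case, assuming dplyr 0.7 or later (which re-exports quos() from rlang). The small data frame and the name sort_cols are made up for illustration and are not the data from the question:

```r
library(dplyr)

# Small illustrative data frame (hypothetical, not the question's df)
df <- data.frame(
  var1 = c(1L, 2L, 2L, 3L, 1L),
  var3 = c(7L, 5L, 5L, 8L, 5L),
  var4 = c("w", "f", "b", "e", "i"),
  stringsAsFactors = FALSE
)

# Capture bare (unquoted) column names as a list of quosures
sort_cols <- quos(var3, var4)

# !!! splices the list into arrange() as separate arguments,
# so this should behave like arrange(df, var3, var4)
sorted <- df %>% arrange(!!!sort_cols)
```

For a single bare column, the same pattern would be quo(var3) together with !!.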

If your columns are already strings, then make them names using rlang::sym() for a single column or rlang::syms() for a vector. See the answer by Christos. You can also use as.name() for a single column. Unfortunately as of this writing, the information on how to use rlang::sym() has not yet made it into the vignette I link to above (eventually it will be in the section on "variadic quasiquotation" according to his draft).
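For the case in the question, where the column names arrive as strings in a function parameter, something like the following should work. This is a sketch assuming dplyr 0.7+ with rlang installed; arrange_by_names is a hypothetical helper name, and the data frame is a small stand-in for the one in the question:

```r
library(dplyr)

# Small illustrative data frame (hypothetical, not the question's df)
df <- data.frame(
  var1 = c(1L, 2L, 2L, 3L, 1L),
  var3 = c(7L, 5L, 5L, 8L, 5L),
  var4 = c("w", "f", "b", "e", "i"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sort a data frame by the columns named in `vars`
arrange_by_names <- function(data, vars) {
  # syms() turns each string into a symbol; !!! splices them into arrange()
  data %>% arrange(!!!rlang::syms(vars))
}

vector_of_vars <- c("var3", "var4")
sorted <- arrange_by_names(df, vector_of_vars)
```

With a single column name, rlang::sym() and !! play the same roles as syms() and !!! do here.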

answered Sep 29 '22 17:09 by farnsy