Arrange by custom order in dplyr using dynamic column names

Tags: r, dplyr

I am trying to arrange a tibble by a custom order on one column, where the column name is set dynamically through a variable.

The first block of code below is what I am currently trying, but it gives an error. The second block works, but it hard-codes the column name symbol, which I want to come from a variable instead.

library(tidyverse)

group_var <- "symbol"

date_seq <- seq(as.Date("2000-01-01"), as.Date("2009-12-31"), by = "days")
test_tbl <- tibble::tibble("date" = rep(date_seq, 3),
                           "symbol" = rep(c("test3", "test1", "test2"), each = length(date_seq)),
                           "value" = c(rnorm(length(date_seq), sd = 0.05),
                                       rnorm(length(date_seq), sd = 0.05),
                                       rnorm(length(date_seq), sd = 0.05)))

order_var <- c("test1", "test2", "test3")
test_tbl_final <- test_tbl %>%
  dplyr::arrange(factor(!!group_var, levels = order_var), date)

Below is the code that works and shows what I am trying to accomplish:

library(tidyverse)

date_seq <- seq(as.Date("2000-01-01"), as.Date("2009-12-31"), by = "days")
test_tbl <- tibble::tibble("date" = rep(date_seq, 3),
                           "symbol" = rep(c("test3", "test1", "test2"), each = length(date_seq)),
                           "value" = c(rnorm(length(date_seq), sd = 0.05),
                                       rnorm(length(date_seq), sd = 0.05),
                                       rnorm(length(date_seq), sd = 0.05)))

order_var <- c("test1", "test2", "test3")
test_tbl_final <- test_tbl %>%
  dplyr::arrange(factor(symbol, levels = order_var), date)

Trevor Nederlof asked Feb 27 '18 18:02


2 Answers

You can also use as.symbol from base R:

test_tbl_final <- test_tbl %>%
  dplyr::arrange(factor(!!as.symbol(group_var), levels = order_var), date)
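
One way to see why the attempt in the question fails is to look at what !! actually splices into the call. The following is a minimal sketch, assuming the rlang package is installed; rlang::expr() captures an expression after unquoting but does not evaluate it, so order_var does not need to exist here. The names bad_expr and good_expr are made up for illustration.

```r
library(rlang)

group_var <- "symbol"

# Unquoting the character variable splices in the string itself,
# so factor() would receive the literal "symbol", not the column.
bad_expr <- expr(factor(!!group_var, levels = order_var))
print(bad_expr)
# factor("symbol", levels = order_var)

# Converting to a symbol first splices in a bare column name,
# which matches the hard-coded version from the question.
good_expr <- expr(factor(!!as.symbol(group_var), levels = order_var))
print(good_expr)
# factor(symbol, levels = order_var)
```

The second expression is the same call as the hard-coded one, which is why it sorts as intended inside arrange().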

IceCreamToucan answered Oct 02 '22 16:10


You need rlang::sym to convert group_var from a character string to a symbol, and then !! to unquote that symbol so it is evaluated as a column:

test_tbl %>% 
    arrange(factor(!!rlang::sym(group_var), levels = order_var), date)

# A tibble: 10,959 x 3
#         date symbol         value
#       <date>  <chr>         <dbl>
# 1 2000-01-01  test1  0.0519143671
# 2 2000-01-02  test1 -0.0464782439
# 3 2000-01-03  test1 -0.0295441613
# ...
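
As a possible alternative that avoids building a symbol at all, the .data pronoun can look a column up by its string name inside arrange(). This is a sketch rather than part of the original answer: it assumes a dplyr version that supports the .data pronoun, and small_tbl and sorted_tbl are hypothetical names for a cut-down version of the data in the question.

```r
library(dplyr)

group_var <- "symbol"
order_var <- c("test1", "test2", "test3")

# Small stand-in for test_tbl: two dates per symbol, deliberately out of order.
small_tbl <- tibble::tibble(
  date   = rep(as.Date("2000-01-01") + 0:1, times = 3),
  symbol = rep(c("test3", "test1", "test2"), each = 2),
  value  = 1:6
)

# .data[[group_var]] fetches the column named by the string in group_var,
# so the factor levels control the sort order, with date as the tie-breaker.
sorted_tbl <- small_tbl %>%
  arrange(factor(.data[[group_var]], levels = order_var), date)

print(sorted_tbl$symbol)
# "test1" "test1" "test2" "test2" "test3" "test3"
```

Whether to use .data[[...]] or !!rlang::sym(...) is mostly a matter of taste; both refer to the same column here.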

Psidom answered Oct 02 '22 17:10