Bind rows of data frames with some factor columns

Question

I want to create a souped-up version of dplyr::bind_rows that avoids the "Unequal factor levels: coercing to character" warnings when factor columns are present in the data frames we're trying to combine (which may also have non-factor columns). Here's an example:

# tibble::tibble() replaces the deprecated dplyr::data_frame()
df1 <- tibble::tibble(age = 1:3, gender = factor(c("male", "female", "female")), district = factor(c("north", "south", "west")))
df2 <- tibble::tibble(age = 4:6, gender = factor(c("male", "neutral", "neutral")), district = factor(c("central", "north", "east")))

Then bind_rows_with_factor_columns(df1, df2) should return (without warnings):

tibble::tibble(
  age = 1:6,
  gender = factor(c("male", "female", "female", "male", "neutral", "neutral")),
  district = factor(c("north", "south", "west", "central", "north", "east"))
)

Here's what I have so far:

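# Assumes %>% is available, e.g. via library(dplyr) or library(magrittr)
bind_rows_with_factor_columns <- function(...) {
  # Collect the inputs into a list so purrr::map() iterates over data frames
  dfs <- list(...)

  # Names of the factor columns in each data frame
  factor_columns <- purrr::map(dfs, function(df) {
    colnames(dplyr::select_if(df, is.factor))
  })

  if (length(unique(factor_columns)) > 1) {
    stop("All factor columns in dfs must have the same column names")
  }

  # Coerce factors to character so bind_rows() has no levels to reconcile
  df_list <- purrr::map(dfs, function(df) {
    purrr::map_if(df, is.factor, as.character) %>% tibble::as_tibble()
  })

  # Bind, then turn the former factor columns back into factors
  dplyr::bind_rows(df_list) %>%
    purrr::map_at(factor_columns[[1]], as.factor) %>%
    tibble::as_tibble()
}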

Does anyone have ideas on how to use the forcats package to avoid coercing factors to characters, or any general suggestions for improving the performance of this while keeping the same functionality? (I'd like to stick to tidyverse syntax.) Thanks!

Nick Resnick · Accepted Answer

Going to answer my own question based on a great solution from a friend:

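bind_rows_with_factor_columns <- function(...) {
  # pmap walks the data frames column by column (matched by position),
  # passing the i-th column of every data frame to the function below
  purrr::pmap_df(list(...), function(...) {
    cols_to_bind <- list(...)
    if (all(purrr::map_lgl(cols_to_bind, is.factor))) {
      # Combine the factors, keeping the union of their levels;
      # forcats >= 0.3.0 takes factors via ..., so splice the list with !!!
      forcats::fct_c(!!!cols_to_bind)
    } else {
      # Non-factor columns are simply concatenated
      unlist(cols_to_bind)
    }
  })
}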
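
One caveat with the approach above is that columns are paired by position rather than by name, so inputs whose columns are in a different order would be combined incorrectly. As a possible alternative, here is a minimal sketch that first aligns the levels of each shared factor column and then lets dplyr::bind_rows match columns by name. The function name bind_rows_unified is hypothetical (it is not part of dplyr or forcats), and the sketch assumes a column should be handled as a factor only when it is a factor in every input.

```r
# Hypothetical helper; the name is illustrative, not from any package.
bind_rows_unified <- function(...) {
  dfs <- list(...)

  # Columns that are factors in every input data frame
  factor_cols <- Reduce(intersect, lapply(dfs, function(df) {
    names(df)[vapply(df, is.factor, logical(1))]
  }))

  for (col in factor_cols) {
    # Union of the levels across inputs, in order of first appearance
    all_levels <- forcats::lvls_union(lapply(dfs, function(df) df[[col]]))

    # Re-level each input so the column has identical levels everywhere
    dfs <- lapply(dfs, function(df) {
      df[[col]] <- factor(df[[col]], levels = all_levels)
      df
    })
  }

  # With identical levels there is nothing left for bind_rows() to coerce
  dplyr::bind_rows(dfs)
}

# Usage with the data frames from the question
df1 <- tibble::tibble(
  age = 1:3,
  gender = factor(c("male", "female", "female")),
  district = factor(c("north", "south", "west"))
)
df2 <- tibble::tibble(
  age = 4:6,
  gender = factor(c("male", "neutral", "neutral")),
  district = factor(c("central", "north", "east"))
)

combined <- bind_rows_unified(df1, df2)
levels(combined$district)
```

Note that the resulting levels follow the order in which they first appear across the inputs, not the alphabetical order that factor(c(...)) would produce, so the values match the expected output in the question but the level order may differ. Also, on dplyr 1.0.0 or later, bind_rows is built on vctrs and should already combine factors with different levels into a factor without this warning, so it is worth checking the installed version before adding a wrapper.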

Tags: r, dplyr, purrr, tidyverse
