
Converting multiple columns to factors and releveling with mutate(across)

Tags: dataframe, r, dplyr

dat <- data.frame(Comp1Letter = c("A", "B", "D", "F", "U", "A*", "B", "C"),
                   Comp2Letter = c("B", "C", "E", "U", "A", "C", "A*", "E"),
                   Comp3Letter = c("D", "A", "C", "D", "F", "D", "C", "A"))  

GradeLevels <- c("A*", "A", "B", "C", "D", "E", "F", "G", "U")

I have a dataframe that looks something like the above (but with many other columns I don't want to change).

The columns I am interested in changing contain lists of letter grades, but they are currently character vectors and not in the right order.

I need to convert each of these columns into factors with the correct order. I've been able to get this to work using the code below:

factordat <-
    dat %>%
      mutate(Comp1Letter = factor(Comp1Letter, levels = GradeLevels)) %>%
      mutate(Comp2Letter = factor(Comp2Letter, levels = GradeLevels)) %>%
      mutate(Comp3Letter = factor(Comp3Letter, levels = GradeLevels)) 

However this is super verbose and chews up a lot of space.

Looking at some other questions, I've tried to use a combination of mutate() and across(), as seen below:

factordat <-
  dat %>%
    mutate(across(c(Comp1Letter, Comp2Letter, Comp3Letter) , factor(levels = GradeLetters))) 

However when I do this the vectors remain character vectors.

Could someone please tell me what I'm doing wrong or offer another option?

Alan Nielsen asked Oct 27 '25 08:10


1 Answer

across() needs a function to apply to each column, not a function call. You can supply one as a formula-style anonymous function like this:

dat <- data.frame(Comp1Letter = c("A", "B", "D", "F", "U", "A*", "B", "C"),
                   Comp2Letter = c("B", "C", "E", "U", "A", "C", "A*", "E"),
                   Comp3Letter = c("D", "A", "C", "D", "F", "D", "C", "A"))  

GradeLevels <- c("A*", "A", "B", "C", "D", "E", "F", "G", "U")

# parse_factor() is exported by readr (not forcats)
dat %>%
  tibble::as_tibble() %>%
  dplyr::mutate(dplyr::across(c(Comp1Letter, Comp2Letter, Comp3Letter), ~ readr::parse_factor(., levels = GradeLevels)))

# # A tibble: 8 × 3
#   Comp1Letter Comp2Letter Comp3Letter
#   <fct>       <fct>       <fct>      
# 1 A           B           D          
# 2 B           C           A          
# 3 D           E           C          
# 4 F           U           D          
# 5 U           A           F          
# 6 A*          C           D          
# 7 B           A*          C          
# 8 C           E           A     

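You were close. across() expects a function that it can call on each selected column, but factor(levels = GradeLetters) is a call that is evaluated straight away, with no column passed in. All that was left to do was make it an anonymous function, either with ~ and . in tidyverse or with function(x) and x in base R. Note too that your attempt refers to GradeLetters, while the vector is named GradeLevels.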
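
As a rough sketch of those two spellings, here is the same conversion using base factor(), as in the question's working code, in place of a parsing helper. The names factordat_base and factordat_lambda are only illustrative, and the example assumes dplyr 1.0.0 or later is installed.

```r
library(dplyr)

dat <- data.frame(
  Comp1Letter = c("A", "B", "D", "F", "U", "A*", "B", "C"),
  Comp2Letter = c("B", "C", "E", "U", "A", "C", "A*", "E"),
  Comp3Letter = c("D", "A", "C", "D", "F", "D", "C", "A")
)

GradeLevels <- c("A*", "A", "B", "C", "D", "E", "F", "G", "U")

# Base R anonymous function: across() calls it once per selected column
factordat_base <- dat %>%
  mutate(across(c(Comp1Letter, Comp2Letter, Comp3Letter),
                function(x) factor(x, levels = GradeLevels)))

# Tidyverse formula lambda: .x stands for the current column
factordat_lambda <- dat %>%
  mutate(across(c(Comp1Letter, Comp2Letter, Comp3Letter),
                ~ factor(.x, levels = GradeLevels)))

str(factordat_base)
```

Both forms should give the same result. Any columns not named inside across() are left untouched, which matters if the real data frame has many other columns.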

dcsuka answered Oct 28 '25 21:10


