
Best way to convert numeric variable to ordered factor

Tags:

r

I have a data frame that looks something like this:

df1 <- data.frame(V1 = rnorm(n = 100, mean = 0, sd = 1),
                  Edu = sample(x = c(-999, 12, 13, 14, 16, 1), size = 100,
                               replace = TRUE, prob = c(0.05, 0.2, 0.2, 0.2, 0.2, 0.15)))

I want to convert the variable Edu to an ordered factor variable. I can convert it to a character variable with this code:

lutedu <- c('-999' = NA, '12' = "High School", '13' = "Associate's", 
         '14' = "Associate's", '16' = "Bachelor's", 
         '18' = "Master's, Graduate/professional", '21' = "PhD")

df1$Edu <- lutedu[as.character(df1$Edu)]

and from there I could convert the character variable to an ordered factor with ordered():

df1$Edu <-
  ordered(
    x = df1$Edu, levels = c(
      "High School", "Associate's", "Bachelor's",
      "Master's, Graduate/professional", "PhD"
    )
  )

Is there a better way of doing this?

Ignacio asked Sep 16 '25 19:09


1 Answer

Instead of recoding with a named vector and then calling ordered, you can save yourself a step by calling ordered and using both the levels and the labels arguments:

ordered(edu, levels=c(-999, 12, 13, 14, 16, 1),
        labels=c("NA", "High School", "Associate's", "Bachelor's",
                 "Master's/Graduate", "PhD"))
#   [1] High School       Master's/Graduate Master's/Graduate Bachelor's        Associate's      
#   [6] Master's/Graduate High School       Master's/Graduate High School       PhD              
# ...

Data:

set.seed(144)
edu <- sample(x = c(-999, 12, 13, 14, 16, 1), size = 100,
              replace = TRUE, prob = c(0.05, 0.2, 0.2, 0.2, 0.2, 0.15))
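
Two details of the call above may matter for the data in the question. Labelling -999 as "NA" produces an ordinary level whose name is the string "NA", not a missing value, and the question maps both 13 and 14 to "Associate's". Below is a minimal sketch of one way to handle both, assuming R 3.4.0 or later, where factor() accepts repeated labels and merges them into a single level. The names edu_codes, edu_levels and edu_labels and the values in edu_codes are made up for illustration; they are not the data used above.

```r
# Illustrative codes only (an assumption, not the question's or answer's data).
edu_codes <- c(12, 13, 14, 16, 18, 21, -999, 13)

# -999 is deliberately left out of the levels: any value that is not
# listed in `levels` becomes a true NA in the resulting factor.
edu_levels <- c(12, 13, 14, 16, 18, 21)

# One label per level, in order. Repeating "Associate's" (R >= 3.4.0)
# collapses codes 13 and 14 into the same level.
edu_labels <- c("High School", "Associate's", "Associate's", "Bachelor's",
                "Master's, Graduate/professional", "PhD")

edu_ord <- ordered(edu_codes, levels = edu_levels, labels = edu_labels)

levels(edu_ord)          # five levels, lowest to highest
is.na(edu_ord)           # TRUE only for the -999 entry
edu_ord >= "Bachelor's"  # ordered factors support comparisons
```

Here the -999 entry is a real NA because it does not appear in levels, and the result has five levels rather than six.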
josliber answered Sep 19 '25 09:09