I have a question about factors in R. I want to replace each value of a factor with the next higher factor level. Here's an example:
Suppose I have the vector have, which I want to treat as a factor:
set.seed(1)
have <- sample(1:20, 10, TRUE)
have
# [1] 4 7 1 2 11 14 18 19 1 10
What I would like to get is this:
[1] 7 10 2 4 14 18 19 <NA> 2 11
That is, each value is replaced with the next higher factor level (4 becomes 7, 7 becomes 10, etc.), and the highest value is replaced with NA.
One way to achieve this would be:
want <- factor(have)
levels(want) <- c(levels(want)[-1], NA)
want
# [1] 7 10 2 4 14 18 19 <NA> 2 11
# Levels: 2 4 7 10 11 14 18 19
Is there another way to do this?
I have received three very good answers that I'll try to summarize here:
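func_lookup <- function(x) {
  lu <- sort(unique(x))
  # build a lookup vector: position v holds the next higher value after v
  lu <- "[<-"(NA, lu, c(lu[-1], NA))
  # index the lookup vector by the values themselves
  lu[x]
}
func_dplyr <- function(x) {
  # shift the levels forward by one; the highest level becomes NA
  levels(x) <- dplyr::lead(levels(x))
  x
}
func_base <- function(x) {
  vals <- sort(unique(x))
  # position of each value among the sorted unique values, shifted by one;
  # a position past the end of vals yields NA
  vals[match(x, vals) + 1]
}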
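As can be seen from the examples, func_lookup only works for plain vectors (and, because it uses the values themselves as index positions, only for positive whole numbers), while func_dplyr only works for factors. func_base works with both factors and vectors.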
# Example 1
set.seed(1)
# create sample data
have <- c(4, 6, 6, 7)
# create sample data as factor
have_f <- factor(have)
# test functions for factor
have_f
func_lookup(have_f)
func_dplyr(have_f)
func_base(have_f)
#> have_f
#[1] 4 6 6 7
#Levels: 4 6 7
#> func_lookup(have_f)
#[1] 2 3 3 NA
#> func_dplyr(have_f)
#[1] 6 7 7 <NA>
#Levels: 6 7
#> func_base(have_f)
#[1] 6 7 7 <NA>
#Levels: 4 6 7
# test functions for vectors
func_lookup(have)
#func_dplyr(have)
func_base(have)
#> func_lookup(have)
#[1] 6 7 7 NA
#> func_base(have)
#[1] 6 7 7 NA
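As a quick sanity check, the level-shifting approach from the question and func_base can be compared on the same small vector. This is only a sketch: shift_levels is a hypothetical name for the code from the question wrapped in a function, and the factor result is converted back to numbers so the two outputs are comparable.

```r
have <- c(4, 6, 6, 7)

# Hypothetical wrapper around the approach from the question:
# drop the lowest level and append NA, so every value moves up one level
shift_levels <- function(x) {
  f <- factor(x)
  levels(f) <- c(levels(f)[-1], NA)
  f
}

# func_base as summarized above
func_base <- function(x) {
  vals <- sort(unique(x))
  vals[match(x, vals) + 1]
}

# convert the factor result back to numeric before comparing
res_levels <- as.numeric(as.character(shift_levels(have)))
res_base <- func_base(have)

identical(res_levels, res_base)
```

Going through as.character first matters here: calling as.numeric directly on a factor returns the internal integer codes rather than the level labels.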
Sort the unique values of have, use match to get their index positions, add 1 to get the position of the next value, and subset with it.
vals <- sort(unique(have))
vals[match(have, vals) + 1]
#[1] 7 10 2 4 14 18 19 NA 2 11
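To make the individual steps visible, here is the same idea spelled out with intermediate variables. It is a sketch on a smaller input, c(4, 6, 6, 7), rather than the vector from the question; idx and nxt are names introduced only for illustration.

```r
have <- c(4, 6, 6, 7)

# step 1: sorted unique values
vals <- sort(unique(have))   # 4 6 7

# step 2: position of each element of have within vals
idx <- match(have, vals)     # 1 2 2 3

# step 3: move one position to the right
nxt <- idx + 1               # 2 3 3 4

# step 4: subset; position 4 is past the end of vals, so it gives NA
vals[nxt]                    # 6 7 7 NA
```

The highest value ends up as NA without any special handling, because indexing a vector beyond its length returns NA in R.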