
Replace contents of factor column in R dataframe

Tags:

r

I need to replace the levels of a factor column in a dataframe. Using the iris dataset as an example, how would I replace any cells which contain virginica with setosa in the Species column?

I expected the following to work, but it generates a warning message and simply inserts NAs:

iris$Species[iris$Species == 'virginica'] <- 'setosa' 
luciano asked Aug 04 '12 17:08


1 Answer

I bet the problem arises when the replacement value is new, i.e. not already one of the factor's existing levels:

levels(iris$Species) # [1] "setosa"     "versicolor" "virginica"  

Your example does not actually reproduce the problem. Because 'setosa' is already a level, this works:

iris$Species[iris$Species == 'virginica'] <- 'setosa' 
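As a quick check of that claim, here is a minimal sketch that runs the same assignment on a copy of iris (the name df is only for illustration) and confirms that no NAs appear. Note that 'virginica' remains a level afterwards, just with zero rows.

```r
# Work on a copy so the built-in iris dataset is left unchanged
df <- iris

# 'setosa' is already a level, so this assignment is valid
df$Species[df$Species == "virginica"] <- "setosa"

anyNA(df$Species)   # check whether any NAs were introduced
table(df$Species)   # counts per level, including the now-empty one
```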

This is more likely what caused the problem you saw with your own data:

iris$Species[iris$Species == 'virginica'] <- 'new.species'
# Warning message:
# In `[<-.factor`(`*tmp*`, iris$Species == "virginica", value = c(1L,  :
#   invalid factor level, NAs generated

It will work if you first add the new value to the factor's levels:

levels(iris$Species) <- c(levels(iris$Species), "new.species")
iris$Species[iris$Species == 'virginica'] <- 'new.species'
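A minimal sketch of the same steps on a copy of iris, so the built-in dataset stays untouched. The name df is only for illustration, and the droplevels() call is an optional extra, not part of the original answer: it removes 'virginica', which would otherwise remain as an unused level.

```r
# Work on a copy so the built-in iris dataset is left unchanged
df <- iris

# Add the new level first; without this the assignment below yields NAs
levels(df$Species) <- c(levels(df$Species), "new.species")
df$Species[df$Species == "virginica"] <- "new.species"

# Optional: drop "virginica", which is still a level but has no rows left
df$Species <- droplevels(df$Species)

levels(df$Species)
table(df$Species)
```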

If you want to replace every occurrence of one value with another, say "oldspecies" with "newspecies", you'd be better off renaming the level itself:

levels(iris$Species)[match("oldspecies",levels(iris$Species))] <- "newspecies" 
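Applied to the question's data, this might look like the sketch below. The names df, renamed and merged are hypothetical. It relies on the fact that assigning a name that is already a level merges the two levels, which is one way to turn every 'virginica' into 'setosa' without leaving an empty level behind.

```r
# Work on a copy so the built-in iris dataset is left unchanged
df <- iris

# Rename a level: every "virginica" row becomes "newspecies"
renamed <- df$Species
levels(renamed)[match("virginica", levels(renamed))] <- "newspecies"

# Merge levels: reusing an existing name folds "virginica" into "setosa"
merged <- df$Species
levels(merged)[match("virginica", levels(merged))] <- "setosa"

levels(renamed)
levels(merged)
table(merged)
```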
flodel answered Oct 01 '22 12:10