
Convert NA into a factor level

Tags: r, missing-data

I have a factor containing NA values, and I would like to replace them with a new factor level "NA".

a = as.factor(as.character(c(1, 1, 2, 2, 3, NA)))
a
[1] 1    1    2    2    3    <NA>
Levels: 1 2 3

This works, but it seems like a strange way to do it.

# convert to character first so ifelse() sees the labels, not the integer codes
a = as.factor(ifelse(is.na(a), "NA", as.character(a)))
class(a)
[1] "factor"

This is the expected output:

a
[1] 1  1  2  2  3  NA
Levels: 1 2 3 NA
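
One caveat with the ifelse() approach: ifelse() drops factor attributes, so passing a factor directly returns its integer codes. With levels 1, 2, 3 the codes happen to match the labels, which hides the problem. A minimal sketch, using hypothetical labels ("low", "high") that are not part of the original question:

```r
# Hypothetical labels, chosen so that codes and labels differ
f <- factor(c("low", "high", NA, "low"))

# ifelse() strips the factor attributes, so the "no" branch yields integer codes
codes_version <- ifelse(is.na(f), "NA", f)

# Converting to character first keeps the original labels
labels_version <- ifelse(is.na(f), "NA", as.character(f))

print(codes_version)
print(labels_version)

# Build the factor from the label-preserving version
g <- factor(labels_version)
```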
marbel asked Nov 28 '14 21:11



3 Answers

You can use addNA().

x <- c(1, 1, 2, 2, 3, NA)
addNA(x)
# [1] 1    1    2    2    3    <NA>
# Levels: 1 2 3 <NA>

This is basically a convenience function for factoring with exclude = NULL. From help(factor):

addNA modifies a factor by turning NA into an extra level (so that NA values are counted in tables, for instance).

Another advantage is that if you already have a factor f, addNA(f) returns a copy with NA added as a level and leaves f itself unchanged. As the documentation mentions, this is handy for tables. It also reads nicely.
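
A short sketch of both points, assuming a factor f built from the question's data: the original factor keeps its three levels, and table() counts the NA only for the addNA() result.

```r
f <- factor(c(1, 1, 2, 2, 3, NA))
g <- addNA(f)

levels(f)   # f is unchanged and still has three levels
levels(g)   # g has a fourth level, NA

table(f)    # the NA is omitted from the counts
table(g)    # the NA is counted under the <NA> level

# ifany = TRUE adds the NA level only when the data actually contain an NA
h <- addNA(factor(c(1, 2)), ifany = TRUE)
nlevels(h)
```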

Rich Scriven answered Oct 18 '22 21:10


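You can add the NA as a level and change the level name to something more explicit than <NA> using fct_explicit_na from package forcats. (Note that since forcats 1.0.0, fct_explicit_na() is deprecated in favour of fct_na_value_to_level().)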

library(forcats)

By default you get the new level as (Missing):

fct_explicit_na(a)

[1] 1         1         2         2         3         (Missing)
Levels: 1 2 3 (Missing)

You can set it to something else:

fct_explicit_na(a, "unknown")

[1] 1       1       2       2       3       unknown
Levels: 1 2 3 unknown
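
If you would rather not depend on forcats, a similar result can be sketched in base R by adding the NA level with addNA() and then renaming it. This is an alternative technique, not what fct_explicit_na does internally; the name b is only illustrative.

```r
a <- factor(c(1, 1, 2, 2, 3, NA))

# Add NA as a level, then rename that level to an explicit label
b <- addNA(a)
levels(b)[is.na(levels(b))] <- "unknown"

b
```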
aosmith answered Oct 18 '22 20:10


Set the exclude argument to NULL to include NAs as levels. Use factor instead of as.factor: it does the same thing and has more arguments to set:

a = factor(as.character(c(1, 1, 2, 2, 3, NA)), exclude = NULL)

a
[1] 1    1    2    2    3    <NA>
Levels: 1 2 3 <NA>
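
Note that exclude = NULL produces a true NA level, whereas the expected output in the question uses the string "NA" as a level. A minimal sketch of the difference (the names f_na and f_str are illustrative):

```r
x <- c(1, 1, 2, 2, 3, NA)

f_na  <- factor(x, exclude = NULL)           # the extra level is a real NA
f_str <- factor(ifelse(is.na(x), "NA", x))   # the extra level is the string "NA"

anyNA(levels(f_na))       # TRUE
"NA" %in% levels(f_na)    # FALSE
"NA" %in% levels(f_str)   # TRUE

# With NA as a level, the elements themselves are no longer reported as missing
is.na(f_na)
```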
LyzandeR answered Oct 18 '22 21:10