R factor NA vs <NA>

Tags:

r

missing-data

na

I have the following data frame:

df1 <- data.frame(id = 1:20, fact1 = factor(rep(c('abc','def','NA',''),5)))
df1
   id fact1
1   1   abc
2   2   def
3   3    NA
4   4      
5   5   abc
6   6   def
7   7    NA
8   8      
9   9   abc
10 10   def
11 11    NA
12 12      
13 13   abc
14 14   def
15 15    NA
16 16      
17 17   abc
18 18   def
19 19    NA
20 20

I'm trying to standardize all the missing values (both '' and 'NA') so that they become real NAs. However, when I use this:

df1[df1 == ''] <- NA

there seem to be two kinds of NA:

df1
   id fact1
1   1   abc
2   2   def
3   3    NA
4   4  <NA>
5   5   abc
6   6   def
7   7    NA
8   8  <NA>
9   9   abc
10 10   def
11 11    NA
12 12  <NA>
13 13   abc
14 14   def
15 15    NA
16 16  <NA>
17 17   abc
18 18   def
19 19    NA
20 20  <NA>

Is there a best-practice method for dealing with this situation?

asked Jun 14 '13 19:06

screechOwl

1 Answer

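Expanding on joran's comment: in a factor, the string 'NA' is an ordinary level and prints as NA, whereas a truly missing value prints as <NA>. Compare the column against both placeholder strings and assign a real NA: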

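# Row 3 is a real missing value; 'NA' and '' are ordinary factor levels
df1 <- data.frame(id = 1:5, fact1 = factor(c('abc','def', NA, 'NA','')))
df1
  id fact1
1  1   abc
2  2   def
3  3  <NA>
4  4    NA
5  5      

# Replace both placeholder strings with a real NA
df1[df1 == '' | df1 == 'NA'] <- NA
df1
  id fact1
1  1   abc
2  2   def
3  3  <NA>
4  4  <NA>
5  5  <NA>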
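
One thing the assignment above leaves behind is the pair of placeholder levels: '' and 'NA' are still listed in levels(df1$fact1) even though no row uses them any more. Below is a minimal sketch of a variant, assuming the same five-row df1 and assuming that '' and 'NA' are the only strings that should count as missing (the `placeholders` vector is a name made up for this illustration). It matches with %in%, which never returns NA, and then calls droplevels() to discard the unused levels.

```r
# Sketch only: same toy data as in the answer above.
df1 <- data.frame(id = 1:5,
                  fact1 = factor(c('abc', 'def', NA, 'NA', '')))

# Before cleaning, only row 3 is actually missing.
is.na(df1$fact1)   # FALSE FALSE  TRUE FALSE FALSE

# Hypothetical list of strings to treat as missing (an assumption; adjust as needed).
placeholders <- c('', 'NA')

# %in% compares the factor's labels and returns FALSE (not NA) for missing
# values, so the logical index contains no NA.
df1$fact1[df1$fact1 %in% placeholders] <- NA

# Remove the now-unused '' and 'NA' levels from the factor.
df1$fact1 <- droplevels(df1$fact1)

is.na(df1$fact1)    # FALSE FALSE  TRUE  TRUE  TRUE
levels(df1$fact1)   # "abc" "def"
```

If the data come from a file rather than being built by hand, passing na.strings = c('', 'NA') to read.csv() should convert both placeholders at import time; that suggestion goes beyond what the answer above covers.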

answered Sep 25 '22 02:09

Zach
