Elegant way to report missing values in a data.frame

Tags:

dataframe

r

missing-data

Here's a little piece of code I wrote to report variables with missing values from a data frame. I'm trying to think of a more elegant way to do this, one that perhaps returns a data.frame, but I'm stuck:

    for (Var in names(airquality)) {
        missing <- sum(is.na(airquality[,Var]))
        if (missing > 0) {
            print(c(Var,missing))
        }
    }

Edit: I'm dealing with data.frames with dozens to hundreds of variables, so it's key that we only report variables with missing values.

496

asked Nov 29 '11 20:11

Zach

1 Answer

Just use sapply:

    > sapply(airquality, function(x) sum(is.na(x)))
      Ozone Solar.R    Wind    Temp   Month     Day 
         37       7       0       0       0       0 
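To report only the variables that actually contain missing values (the requirement in the question's edit), the named vector can be subset by its own values. This is a minimal sketch; `na_counts` is a name introduced here for illustration, not part of the original answer.

```r
# Count NAs per column of the built-in airquality data set
na_counts <- sapply(airquality, function(x) sum(is.na(x)))

# Keep only the columns that have at least one NA
na_counts[na_counts > 0]
```

With `airquality`, this should leave just `Ozone` and `Solar.R`, matching the non-zero counts shown above.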

You could also use apply or colSums on the matrix created by is.na():

    > apply(is.na(airquality),2,sum)
      Ozone Solar.R    Wind    Temp   Month     Day 
         37       7       0       0       0       0 
    > colSums(is.na(airquality))
      Ozone Solar.R    Wind    Temp   Month     Day 
         37       7       0       0       0       0 
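Since the question also asks for something that returns a data.frame, the `colSums` result could be wrapped as follows. This is only a sketch using base R; the names `missing_report`, `variable`, and `n_missing` are hypothetical and chosen here for illustration.

```r
# Column-wise NA counts (base R only)
na_counts <- colSums(is.na(airquality))

# One row per variable: its name and its number of missing values
missing_report <- data.frame(
  variable  = names(na_counts),
  n_missing = as.integer(na_counts),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Drop variables without any missing values
missing_report <- missing_report[missing_report$n_missing > 0, ]
missing_report
```

Filtering after building the data.frame keeps the result short even when the input has hundreds of columns, because only the variables with at least one NA remain.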

answered Oct 08 '22 04:10

Joshua Ulrich
