r count cells with missing values across each row [duplicate]

Question

I have a dataframe as shown below

    Id         Date         Col1       Col2     Col3        Col4
    30         2012-03-31              A42.2    20.46        NA  
    36         1996-11-15   NA                  V73          55
    96         2010-02-07   X48        Z16      13
    40         2010-03-18   AD14                20.12        36
    69         2012-02-21              22.45                     
    11         2013-07-03   81         V017                  TCG11         
    22         2001-06-01                       67
    83         2005-03-16   80.45      V22.15   46.52        X29.11 
    92         2012-02-12   
    34         2014-03-10   82.12      N72.22   V45.44

I am trying to count the number of NA or Empty cells across each row and the final expected output is as follows

    Id         Date         Col1       Col2     Col3        Col4       MissCount
    30         2012-03-31              A42.2    20.46        NA        2
    36         1996-11-15   NA                  V73          55        2
    96         2010-02-07   X48        Z16      13                     1
    40         2010-03-18   AD14                20.12        36        1
    69         2012-02-21              22.45                           3
    11         2013-07-03   81         V017                  TCG11     1    
    22         2001-06-01                       67                     3
    83         2005-03-16   80.45      V22.15   46.52        X29.11    0
    92         2012-02-12                                              4   
    34         2014-03-10   82.12      N72.22   V45.44                 1

The last column MissCount will store the number of NAs or empty cells for each row. Any help is much appreciated.

Tim Biegeleisen · Accepted Answer

The one-liner

rowSums(is.na(df) | df == "")

given by @DavidArenburg in his comment is definitely the way to go, assuming that you don't mind checking every column in the data frame. If you really only want to check Col1 through Col4, then using an apply function might make more sense.

apply(df, 1, function(x) {
                sum(is.na(x[c("Col1", "Col2", "Col3", "Col4")])) +
                sum(x[c("Col1", "Col2", "Col3", "Col4")] == "", na.rm=TRUE)
             })

Edit: Shortened code

apply(df[c("Col1", "Col2", "Col3", "Col4")], 1, function(x) {
                    sum(is.na(x)) +
                    sum(x == "", na.rm=TRUE)
                 })

or if data columns are exactly like the example data:

apply(df[3:6], 1, function(x) {
                        sum(is.na(x)) +
                        sum(x == "", na.rm=TRUE)
                     })

r count cells with missing values across each row [duplicate]

Tags:

r

count

missing-data

Emily Fassbender

1 Answers

Tim Biegeleisen

Recent Activity

Donate For Us

r count cells with missing values across each row [duplicate]

Tags:

r

count

missing-data

Emily Fassbender

1 Answers

Tim Biegeleisen

Related questions

Recent Activity

Donate For Us