I have a data frame as shown below:
Id Date Col1 Col2 Col3 Col4
30 2012-03-31 A42.2 20.46 NA
36 1996-11-15 NA V73 55
96 2010-02-07 X48 Z16 13
40 2010-03-18 AD14 20.12 36
69 2012-02-21 22.45
11 2013-07-03 81 V017 TCG11
22 2001-06-01 67
83 2005-03-16 80.45 V22.15 46.52 X29.11
92 2012-02-12
34 2014-03-10 82.12 N72.22 V45.44
I am trying to count the number of NA or empty cells in each row. The expected output is as follows:
Id Date Col1 Col2 Col3 Col4 MissCount
30 2012-03-31 A42.2 20.46 NA 2
36 1996-11-15 NA V73 55 2
96 2010-02-07 X48 Z16 13 1
40 2010-03-18 AD14 20.12 36 1
69 2012-02-21 22.45 3
11 2013-07-03 81 V017 TCG11 1
22 2001-06-01 67 3
83 2005-03-16 80.45 V22.15 46.52 X29.11 0
92 2012-02-12 4
34 2014-03-10 82.12 N72.22 V45.44 1
The last column MissCount will store the number of NAs or empty cells for each row. Any help is much appreciated.
The one-liner
rowSums(is.na(df) | df == "")
given by @DavidArenburg in his comment is definitely the way to go, provided you don't mind checking every column in the data frame, including Id and Date. If you only want to check Col1 through Col4, you can use apply and restrict the check to those columns:
# For each row, count NA cells plus empty-string cells in Col1..Col4
apply(df, 1, function(x) {
  sum(is.na(x[c("Col1", "Col2", "Col3", "Col4")])) +
    sum(x[c("Col1", "Col2", "Col3", "Col4")] == "", na.rm = TRUE)
})
Edit: a shorter version that subsets the columns before calling apply:
apply(df[c("Col1", "Col2", "Col3", "Col4")], 1, function(x) {
  sum(is.na(x)) +
    sum(x == "", na.rm = TRUE)
})
Or, if the columns are in the same positions as in the example data, select them by index:
apply(df[3:6], 1, function(x) {
  sum(is.na(x)) +
    sum(x == "", na.rm = TRUE)
})
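For completeness, the rowSums one-liner can also be restricted to the four columns by subsetting first, which avoids apply altogether. The following is only a minimal sketch: the sample data is hypothetical (four rows loosely modelled on Ids 30, 36, 83 and 92, with the positions of the blank cells assumed), and it assumes Col1 through Col4 are character columns.

```r
# Hypothetical sample data: the exact positions of NA vs. "" are assumptions
df <- data.frame(
  Id   = c(30, 36, 83, 92),
  Date = c("2012-03-31", "1996-11-15", "2005-03-16", "2012-02-12"),
  Col1 = c("A42.2", NA, "80.45", ""),
  Col2 = c("20.46", "V73", "V22.15", NA),
  Col3 = c(NA, "55", "46.52", ""),
  Col4 = c("", NA, "X29.11", NA),
  stringsAsFactors = FALSE
)

cols <- c("Col1", "Col2", "Col3", "Col4")

# Logical matrix: TRUE where a cell is NA or an empty string.
# For NA cells, is.na() is TRUE, so TRUE | NA evaluates to TRUE.
missing <- is.na(df[cols]) | df[cols] == ""

# rowSums counts the TRUE values in each row
df$MissCount <- rowSums(missing)
```

With this sample data, MissCount comes out as 2, 2, 0 and 4 for the four rows. Because the comparison is vectorised over the whole subset, this should also scale better than a row-wise apply on large data frames.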