I create a dataframe df.
df <- data.frame(id = 1:10,
                 var1 = 10:19,
                 var2 = sample(c(1:2, NA), 10, replace = TRUE),
                 var3 = sample(c(3:5, NA), 10, replace = TRUE))
What I need is a new column var4, which counts the number of non-NA values in each row (excluding the id column). For example, if a row has var1=19, var2=1, var3=NA, then var4=2. I could not find a good way to do this in dplyr. Something like:
df %>% mutate(var4 = ... )
I would appreciate it if anyone could help me with that.
Counting NAs across either rows or columns can be achieved with the apply() function. It takes three main arguments: X is the input matrix (a data frame is converted to a matrix), MARGIN is an integer selecting the dimension, and FUN is the function to apply to each row or column. MARGIN = 1 applies the function across rows, and MARGIN = 2 across columns.
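As a minimal sketch of the apply() approach described above, the following uses a small hand-written data frame (not the random one from the question, so the result is fixed). The helper name value_cols is only an illustration.

```r
# Small fixed data frame (values chosen by hand so the result is reproducible)
df <- data.frame(id   = 1:4,
                 var1 = c(10, 11, 12, 13),
                 var2 = c(NA, 1, 2, NA),
                 var3 = c(4, NA, 5, NA))

# Columns to inspect: everything except id (value_cols is an illustrative name)
value_cols <- setdiff(names(df), "id")

# MARGIN = 1: apply the function to each row and count the non-NA entries
df$var4 <- apply(df[, value_cols], 1, function(row) sum(!is.na(row)))

# MARGIN = 2: apply the same function to each column instead
non_na_per_column <- apply(df[, value_cols], 2, function(col) sum(!is.na(col)))

print(df)
print(non_na_per_column)
```

Here var4 comes out as 2, 2, 3, 1 for the four rows. apply() converts the data frame to a matrix first, so this is simplest when all inspected columns share one type.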
Which aggregate function counts the number of non-NA values in a group? The SAS function N calculates the number of non-blank numeric values across multiple columns.
R provides the nrow() function to get the number of rows of an object. With nrow(), we can easily find out how many rows a matrix or data frame contains.
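A short sketch of nrow() on a hand-written data frame; the variable names total_rows and non_na_rows are illustrative only. Note that nrow() counts rows, so counting non-NA values in a column requires filtering first.

```r
# Hypothetical example data, hand-written so the counts are fixed
df <- data.frame(id   = 1:4,
                 var2 = c(NA, 1, 2, NA))

total_rows  <- nrow(df)                     # number of rows in the data frame
non_na_rows <- nrow(df[!is.na(df$var2), ])  # rows where var2 is not NA

print(total_rows)
print(non_na_rows)
```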
Use select + is.na + rowSums: select(., -id) returns the original data frame (.) with id excluded, and rowSums(!is.na(...)) then counts the number of non-NA values in each row:
df %>% mutate(var4 = rowSums(!is.na(select(., -id))))
# id var1 var2 var3 var4
#1 1 10 NA 4 2
#2 2 11 1 NA 2
#3 3 12 2 5 3
#4 4 13 2 NA 2
#5 5 14 1 NA 2
#6 6 15 1 NA 2
#7 7 16 1 5 3
#8 8 17 NA 4 2
#9 9 18 NA 4 2
#10 10 19 NA NA 1
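For readers who do not have dplyr loaded, the same rowSums(!is.na(...)) idea can be sketched in base R. This is an alternative under the assumption that the column to skip is named id, shown on a small hand-written data frame rather than the random one above.

```r
# Hand-written data so the result is fixed
df <- data.frame(id   = 1:3,
                 var1 = c(10, 11, 12),
                 var2 = c(NA, 1, 2),
                 var3 = c(4, NA, 5))

# Drop the id column by name, flag non-NA cells, then sum the flags per row
df$var4 <- rowSums(!is.na(df[, names(df) != "id"]))

print(df)
```

Because is.na() and rowSums() work on the whole table at once, this avoids calling a function once per row.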