I have a list of columns within a data frame, and I want to check whether all of those columns are NA and create a new column that tells me whether they are NA or not.
Here is an example of it working with one column, where Any_Flag is my new column:
ItemStats_2014$Any_Flag <- ifelse(is.na(ItemStats_2014$Item_Flag_A), "Y", "N")
When I try to run the check over multiple columns, I am not getting what I expect:
ItemStats_2014$Any_Flag <- ifelse(all(is.na(ItemStats_2014[ ,grep("Flag", names(ItemStats_2014), value = T)])), "Y", "N")
It returns "N" (false) for every row.
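A likely reason, sketched on a hypothetical toy data frame (`toy` and its columns are made up for illustration, not the asker's `ItemStats_2014`): `all()` collapses the whole selection into a single TRUE/FALSE, so `ifelse()` produces one value that is then recycled into every row. Note also that if `Any_Flag` already exists from the single-column step, the pattern "Flag" matches it as well, and that column is never NA.

```r
# Hypothetical toy data for illustration only
toy <- data.frame(
  Item_Flag_A = c(NA, "x", NA),
  Item_Flag_B = c(NA, NA, "y"),
  Other       = c(1, 2, 3)
)
flag_cols <- grep("Flag", names(toy), value = TRUE)

# is.na() on a data frame gives a logical matrix (rows x columns)
na_matrix <- is.na(toy[, flag_cols])

# all() reduces the entire matrix to one logical value, not one per row
whole_frame <- all(na_matrix)

# ifelse() on a length-1 test returns a length-1 result,
# which is recycled into every row of the new column
toy$Any_Flag <- ifelse(whole_frame, "Y", "N")
```

Here row 1 has NA in both flag columns, yet it still gets "N", because the test was evaluated once for the whole selection rather than once per row.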
To check which values are NA in an R data frame, we can use the apply function along with the is.na function. This returns the data in logical form, as a matrix of TRUE and FALSE values.
To test if a value is NA, use is.na(). The function is.na(x) returns a logical vector of the same size as x, with value TRUE if and only if the corresponding element in x is NA.
You can test for missing or infinite values by wrapping the check in the function any. So any(is.na(x)) will return TRUE if any of the values of the object are NA, and any(is.infinite(x)) will do the same for -Inf or Inf.
Generally, missing values in the given data are represented with NA. In R, missing values can be detected with the is.na() function. It accepts the data as its argument and determines, for each data point, whether it is a missing value or not.
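As a small illustration of the functions described above, here is a sketch on a made-up vector `x` (the values are arbitrary):

```r
# Arbitrary example vector containing a missing and two infinite values
x <- c(1, NA, Inf, -Inf, 5)

# Element-wise test: one TRUE/FALSE per element of x
na_each <- is.na(x)

# Single TRUE/FALSE: is at least one element NA / infinite?
has_na  <- any(is.na(x))
has_inf <- any(is.infinite(x))

print(na_each)
print(has_na)
print(has_inf)
```

is.na() keeps the length of its input, while wrapping it in any() reduces the result to a single value.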
Data
set.seed(1)
data <- c(LETTERS, NA)
df <- data.frame(Flag_A = sample(data), Flag_B = sample(data),
C = sample(data), D = sample(data), Flag_E = sample(data))
df <- rbind(NA, df)
Code
Identifying all NAs per row:
> df$All_NA <- apply(df[, grep("Flag", names(df))], 1, function(x) all(is.na(x)))
> head(df)
  Flag_A Flag_B    C    D Flag_E All_NA
1   <NA>   <NA> <NA> <NA>   <NA>   TRUE
2      H      K    B    T      Y  FALSE
3      J      W    C    K      P  FALSE
4      O      I    H    I   <NA>  FALSE
5      V      L    M    S      R  FALSE
6      E      N    P    E      I  FALSE
Identifying at least one NA per row:
> df$Any_NA <- apply(df[, grep("Flag", names(df))], 1, function(x) anyNA(x))
> head(df)
  Flag_A Flag_B    C    D Flag_E Any_NA
1   <NA>   <NA> <NA> <NA>   <NA>   TRUE
2      H      K    B    T      Y  FALSE
3      J      W    C    K      P  FALSE
4      O      I    H    I   <NA>   TRUE
5      V      L    M    S      R  FALSE
6      E      N    P    E      I  FALSE
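If a "Y"/"N" column is wanted, as in the question, one possible base R variant counts the NAs per row with rowSums() instead of calling apply(). This is only a sketch on a small hypothetical data frame (`toy`, `flag_cols` and `na_counts` are names introduced here for illustration):

```r
# Hypothetical data: row 1 is all NA, row 3 has one NA in a flag column
toy <- data.frame(
  Flag_A = c(NA, "H", "O"),
  Flag_B = c(NA, "K", "I"),
  C      = c(NA, "B", "H"),
  Flag_E = c(NA, "Y", NA),
  stringsAsFactors = FALSE
)

# Select the flag columns before adding any new columns
flag_cols <- grep("Flag", names(toy), value = TRUE)

# Number of NA values per row across the flag columns
na_counts <- rowSums(is.na(toy[, flag_cols, drop = FALSE]))

# "Y" when every flag column is NA / when at least one is NA
toy$All_NA <- ifelse(na_counts == length(flag_cols), "Y", "N")
toy$Any_NA <- ifelse(na_counts > 0, "Y", "N")
```

Because the comparison is done on a vector with one count per row, ifelse() returns one value per row, which avoids the recycling problem from the question.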
And a data.table way without any apply is:
library(arsenal)  # provides allNA()
library(data.table)
# dummy data
set.seed(1)
data = c(LETTERS, NA)
dt = data.table(Flag_A=sample(data), Flag_B = sample(data), C=sample(data), D=sample(data), Flag_E=sample(data))
dt = rbind(NA, dt)
# All-NA/Any-NA check
columns_to_check = names(dt)[grep('Flag', names(dt))]
dt[, AllNA:=allNA(.SD), by=1:nrow(dt), .SDcols = columns_to_check]
dt[, AnyNA:=anyNA(.SD), by=1:nrow(dt), .SDcols = columns_to_check]