r - check if every column is na

Tags:

r

na

I have a list of columns within a data frame, and I want to check whether all of those columns are NA and create a new column that tells me whether they are NA or not.

Here is an example of it working with one column, where Any_Flag is my new column:

ItemStats_2014$Any_Flag <- ifelse(is.na(ItemStats_2014$Item_Flag_A), "Y", "N")

When I try to run the check over multiple columns, I am not getting what I expect:

ItemStats_2014$Any_Flag <- ifelse(all(is.na(ItemStats_2014[ ,grep("Flag", names(ItemStats_2014), value = T)])), "Y", "N")

It returns FALSE, i.e. "N", for every row.

alexb523 asked Apr 06 '18 18:04


People also ask

How do you check if a column is all NA in R?

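To check which values are NA in an R data frame, use the is.na function, optionally together with apply. This returns a logical matrix of TRUE and FALSE with the same shape as the data frame. To test whether a single column is entirely NA, wrap the check in all, as in all(is.na(df$col)).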

How do you check if there is na in R?

To test if a value is NA, use is.na(). The function is.na(x) returns a logical vector of the same size as x with value TRUE if and only if the corresponding element in x is NA.

How do you check if there is na in a column?

To test whether any value is missing or infinite, wrap the check in the function any. So any(is.na(x)) will return TRUE if any of the values of the object are NA, and any(is.infinite(x)) will return the same for -Inf or Inf.

How do I find missing values in a column in R?

Generally, missing values in the given data are represented with NA. In R, missing values can be detected with the is.na() function. It accepts the data as its argument and reports, element by element, whether each data point is a missing value.
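As a rough illustration of the checks described above, the sketch below counts missing values per column and picks out the columns that are entirely NA. The data frame `scores` and its column names are hypothetical, made up only for this example.

```r
# Hypothetical toy data frame, used only for illustration
scores <- data.frame(x = c(1, NA, 3), y = c(NA, NA, NA), z = c(1, 2, 3))

# Element-wise check on one column: logical vector, same length as the column
is.na(scores$x)

# Positions of the missing values in that column
which(is.na(scores$x))

# Number of missing values in each column
na_per_column <- colSums(is.na(scores))

# Names of the columns where every value is NA
all_na_columns <- names(scores)[na_per_column == nrow(scores)]
all_na_columns
```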


2 Answers

Data

set.seed(1)
data <- c(LETTERS, NA)
df <- data.frame(Flag_A = sample(data), Flag_B = sample(data), 
                 C = sample(data), D = sample(data), Flag_E = sample(data))

df <- rbind(NA, df)

Code

Identifying all NAs per row:

> df$All_NA <- apply(df[, grep("Flag", names(df))], 1, function(x) all(is.na(x)))
> head(df)
  Flag_A Flag_B    C    D Flag_E All_NA
1   <NA>   <NA> <NA> <NA>   <NA>   TRUE
2      H      K    B    T      Y  FALSE
3      J      W    C    K      P  FALSE
4      O      I    H    I   <NA>  FALSE
5      V      L    M    S      R  FALSE
6      E      N    P    E      I  FALSE

Identifying at least one NA per row:

> df$Any_NA <- apply(df[, grep("Flag", names(df))], 1, function(x) anyNA(x))
> head(df)
  Flag_A Flag_B    C    D Flag_E Any_NA
1   <NA>   <NA> <NA> <NA>   <NA>   TRUE
2      H      K    B    T      Y  FALSE
3      J      W    C    K      P  FALSE
4      O      I    H    I   <NA>   TRUE
5      V      L    M    S      R  FALSE
6      E      N    P    E      I  FALSE
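For reference, the attempt in the question fails because all() collapses the whole set of flag columns into a single TRUE/FALSE, so ifelse() yields one value that is recycled down the new column. A possible vectorized alternative to apply() is sketched below with rowSums(). The data frame `items` is a hypothetical stand-in for ItemStats_2014, and the new column names are assumptions chosen so they do not contain "Flag".

```r
# Hypothetical stand-in for the asker's ItemStats_2014 data frame
items <- data.frame(
  Item_Flag_A = c(NA, "x", NA),
  Item_Flag_B = c(NA, NA, "y"),
  Other       = c(1, 2, 3),
  stringsAsFactors = FALSE
)

flag_cols <- grep("Flag", names(items), value = TRUE)
na_matrix <- is.na(items[, flag_cols, drop = FALSE])  # logical matrix, one row per record

# all() over the whole matrix gives ONE value, not one per row
length(all(na_matrix))

# Row-wise checks: count the NAs in each row and compare
all_na <- rowSums(na_matrix) == length(flag_cols)  # every flag column is NA
any_na <- rowSums(na_matrix) > 0                   # at least one flag column is NA

# "Y"/"N" flags in the style of the question
items$All_NA <- ifelse(all_na, "Y", "N")
items$Any_NA <- ifelse(any_na, "Y", "N")
items
```

Because rowSums() works on the whole logical matrix at once, this avoids calling a function once per row, which may matter on large data frames.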
Cainã Max Couto-Silva answered Sep 21 '22 22:09


And a data.table way without any apply is:

library(arsenal)     # provides allNA(), used below
library(data.table)

# dummy data
set.seed(1)
data = c(LETTERS, NA)
dt = data.table(Flag_A=sample(data), Flag_B = sample(data), C=sample(data), D=sample(data), Flag_E=sample(data))
dt = rbind(NA, dt)

# All-NA/Any-NA check
columns_to_check = names(dt)[grep('Flag', names(dt))]
dt[, AllNA:=allNA(.SD), by=1:nrow(dt), .SDcols = columns_to_check]
dt[, AnyNA:=anyNA(.SD), by=1:nrow(dt), .SDcols = columns_to_check]
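Grouping with by = 1:nrow(dt) evaluates the check once per row, which can be slow on large tables. One possible variant, sketched here under the assumption that only data.table is installed (no arsenal), counts the NAs per row with rowSums() instead. The toy table is hypothetical. The column names AllNA and AnyNA follow the code above.

```r
library(data.table)

# Hypothetical toy data, smaller than the dummy data above
dt <- data.table(
  Flag_A = c(NA, "H", "O"),
  Flag_B = c(NA, "K", "I"),
  C      = c(NA, "B", "H"),
  Flag_E = c(NA, "Y", NA)
)

columns_to_check <- grep("Flag", names(dt), value = TRUE)

# is.na(.SD) is a logical matrix over the selected columns;
# rowSums() counts the NAs in each row without per-row grouping
dt[, AllNA := rowSums(is.na(.SD)) == length(columns_to_check), .SDcols = columns_to_check]
dt[, AnyNA := rowSums(is.na(.SD)) > 0, .SDcols = columns_to_check]
dt
```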
Ufos answered Sep 25 '22 22:09