
Function to change blanks to NA

I'm trying to write a function that turns empty strings into NA. A summary of one of my columns looks like this:

      a   b 
 12 210 468 

I'd like to change the 12 empty values to NA. I also have a few other factor columns for which I'd like to change empty values to NA, so I borrowed some stuff from here and there to come up with this:

# change nulls to NAs
nullToNA <- function(df){

  # split df into numeric & non-numeric columns
  a<-df[,sapply(df, is.numeric), drop = FALSE]
  b<-df[,sapply(df, Negate(is.numeric)), drop = FALSE]

  # Change empty strings to NA
  b<-b[lapply(b,function(x) levels(x) <- c(levels(x), NA) ),] # add NA level
  b<-b[lapply(b,function(x) x[x=="",]<- NA),]                 # change Null to NA

  # Put the columns back together
  d<-cbind(a,b)
  d[, names(df)]
}

However, I'm getting this error:

> foo<-nullToNA(bar)  
Error in x[x == "", ] <- NA : incorrect number of subscripts on matrix  
Called from: FUN(X[[i]], ...)

I have tried the answer found here: Replace all 0 values to NA, but it changes all my columns to numeric values.

Travis Heeter asked Nov 02 '16 11:11



3 Answers

You can directly index fields that match a logical criterion. So you can just write:

df[is_empty(df)] = NA

Where is_empty is your comparison, e.g. df == "":

df[df == ""] = NA
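
As a minimal sketch of that indexing on made-up data (the data frame `df` and its column names are assumptions for illustration, not the asker's data):

```r
# Hypothetical example data: one numeric column, two factor columns with blanks
df <- data.frame(
  n = c(1, 2, 3),
  f = factor(c("", "a", "b")),
  g = factor(c("x", "", ""))
)

# df == "" yields a logical matrix with the same shape as df;
# indexing with it replaces only the matching cells
df[df == ""] <- NA

sapply(df, class)   # column types are unchanged by the assignment
colSums(is.na(df))  # number of NA values per column
```

Note that the factor columns keep `""` as an (now unused) level after this assignment; `droplevels()` removes it if that matters.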

But note that is.null(df) won't work (it tests the whole object and returns a single FALSE), and would be weird anyway [1]. I would advise against merging the logic for columns of different types, though! Instead, handle them separately.


[1] You'll almost never encounter NULL inside a table, since that only works if the underlying vector is a list. You can create matrices and data.frames with this constraint, but then is.null(df) will never be TRUE, because the NULL values are wrapped inside the list.
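
One way to handle the column types separately could look like the sketch below. The helper name `blank_to_na` and the example data frame `bar` are hypothetical, introduced only for illustration; this is not the asker's original function.

```r
# Hypothetical helper: replace "" with NA in factor and character columns only,
# leaving every other column type untouched
blank_to_na <- function(df) {
  for (col in names(df)) {
    x <- df[[col]]
    if (is.factor(x)) {
      # NA can be assigned to a factor directly; no extra level is required
      x[which(x == "")] <- NA
      df[[col]] <- droplevels(x)  # drop the now-unused "" level
    } else if (is.character(x)) {
      x[which(x == "")] <- NA
      df[[col]] <- x
    }
  }
  df
}

# Made-up example data (an assumption, not the asker's data)
bar <- data.frame(
  id   = 1:3,
  grp  = factor(c("", "a", "b")),
  note = c("ok", "", "ok"),
  stringsAsFactors = FALSE
)
foo <- blank_to_na(bar)
levels(foo$grp)  # the "" level is gone after droplevels()
```

Because each column is modified in place, the column order and the numeric columns are preserved without splitting and re-binding the data frame.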

Konrad Rudolph answered Oct 19 '22 12:10


How about just:

df[apply(df, 2, function(x) x=="")] = NA

Works fine for me, at least on simple examples.
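
A small sketch comparing this mask with the direct comparison, on made-up data (the data frame and the variable names `mask_apply`, `mask_direct`, and `same_mask` are assumptions for illustration):

```r
# Hypothetical example data: one numeric column and one factor column with a blank
df <- data.frame(n = c(1, 2, 3), f = factor(c("", "a", "b")))

# apply() first converts the data frame to a character matrix, then compares column by column
mask_apply  <- apply(df, 2, function(x) x == "")
# The direct comparison builds the logical matrix without that conversion
mask_direct <- df == ""

# Both masks flag the same cells on this example (dimnames aside)
same_mask <- all(unname(mask_apply) == unname(mask_direct))

df[mask_apply] <- NA
```

One caveat, going by the `apply` documentation: when each call returns a length-one result, as with a one-row data frame, `apply` returns a plain vector rather than a matrix, and indexing a data frame with a plain logical vector selects whole columns. The direct `df == ""` form avoids that edge case.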

juod answered Oct 19 '22 12:10


This worked for me

    df[df == 'NULL'] <- NA

AMS answered Oct 19 '22 12:10