I'm trying to write a function that turns empty strings into NA. A summary of one of my columns looks like this:
      a   b
 12 210 468
I'd like to change the 12 empty values to NA. I also have a few other factor columns for which I'd like to change empty values to NA, so I borrowed some stuff from here and there to come up with this:
# change nulls to NAs
nullToNA <- function(df){
  # split df into numeric & non-numeric columns
  a <- df[, sapply(df, is.numeric), drop = FALSE]
  b <- df[, sapply(df, Negate(is.numeric)), drop = FALSE]
  # Change empty strings to NA
  b <- b[lapply(b, function(x) levels(x) <- c(levels(x), NA)), ] # add NA level
  b <- b[lapply(b, function(x) x[x == "", ] <- NA), ] # change Null to NA
  # Put the columns back together
  d <- cbind(a, b)
  d[, names(df)]
}
However, I'm getting this error:
> foo<-nullToNA(bar)
Error in x[x == "", ] <- NA : incorrect number of subscripts on matrix
Called from: FUN(X[[i]], ...)
I have tried the answer found here: Replace all 0 values to NA but it changes all my columns to numeric values.
Replace Empty String with NA in an R Dataframe
Use df[df == ""] to check whether the values of a data frame are empty strings. The comparison df == "" yields a logical matrix that is TRUE wherever a value is an empty string, and assigning NA to those positions replaces the blank strings in all columns at once.
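As a small sketch of that approach, assuming a hypothetical data frame with two character columns and one numeric column (the column names and values are made up for illustration):

```r
# Hypothetical example data: two character columns and one numeric column
df <- data.frame(
  name  = c("alice", "", "carol"),
  city  = c("", "Oslo", ""),
  score = c(1.5, 2.0, 3.5),
  stringsAsFactors = FALSE
)

# df == "" is a logical matrix; TRUE marks the cells holding an empty string
df[df == ""] <- NA

print(df)
```

The numeric column has no matching cells, so it is left as numeric. On factor columns the same assignment sets the values to NA but keeps the unused "" level.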
The reverse also works: you can replace NA values with an empty string in the columns of an R data frame (data.frame) using is.na() and replace(). With mixed numeric and character columns, dplyr::mutate_if() restricts the replacement to character columns, and dplyr::mutate_at() applies it to multiple selected columns by index or name.
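A base R sketch of that reverse direction, using is.na() and replace() on character columns only. The data frame and the chr_cols name are hypothetical, and dplyr is deliberately not used here so the snippet runs without extra packages:

```r
# Hypothetical example data with NA in both a character and a numeric column
df <- data.frame(
  name  = c("alice", NA, "carol"),
  score = c(1.5, NA, 3.5),
  stringsAsFactors = FALSE
)

# Pick out the character columns only
chr_cols <- vapply(df, is.character, logical(1))

# Replace NA with "" in those columns; numeric columns are left alone
df[chr_cols] <- lapply(df[chr_cols], function(x) replace(x, is.na(x), ""))

print(df)
```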
You can directly index fields that match a logical criterion. So you can just write:
df[is_empty(df)] = NA
Where is_empty is your comparison, e.g. df == "":
df[df == ""] = NA
But note that is.null(df) won’t work, and would be weird anyway [1]. I would advise against merging the logic for columns of different types, though! Instead, handle them separately.
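Handling the column types separately might look like the following sketch. The helper name empty_to_na and the example data are hypothetical; the factor branch relies on the fact that assigning NA to a level turns its entries into NA and drops the level:

```r
# Hypothetical example data: a factor, a character and a numeric column
df <- data.frame(
  grp   = factor(c("", "a", "b", "a")),
  label = c("x", "", "y", ""),
  n     = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert "" to NA in one column, depending on its type
empty_to_na <- function(x) {
  if (is.factor(x)) {
    # Setting the "" level to NA makes its entries NA and removes the level
    levels(x)[levels(x) == ""] <- NA
  } else if (is.character(x)) {
    # %in% never returns NA, so existing NA values are left untouched
    x[x %in% ""] <- NA
  }
  x  # numeric and other column types are returned unchanged
}

# df[] keeps the data.frame structure while replacing every column
df[] <- lapply(df, empty_to_na)

str(df)
```

Unlike a blanket df[df == ""] assignment, this also removes the empty level from factor columns, so it no longer shows up in summary().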
[1] You’ll almost never encounter NULL inside a table since that only works if the underlying vector is a list. You can create matrices and data.frames with this constraint, but then is.null(df) will never be TRUE because the NULL values are wrapped inside the list.
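To illustrate the footnote, here is a sketch with a hypothetical list column (the names payload and null_entries are made up). It shows that is.null() on the data frame is FALSE even though one cell holds NULL, and that the NULL has to be found element by element:

```r
# Hypothetical data frame with a list column that holds a NULL entry
df <- data.frame(id = 1:3)
df$payload <- list("x", NULL, "z")

is.null(df)  # FALSE: the data frame itself is not NULL

# The NULL is only visible when inspecting the list column element by element
null_entries <- vapply(df$payload, is.null, logical(1))

# Replace those entries with NA; the column stays a list
df$payload[null_entries] <- list(NA)
```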
How about just:
df[apply(df, 2, function(x) x=="")] = NA
Works fine for me, at least on simple examples.
This worked for me
df[df == 'NULL'] <- NA
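If the empty or 'NULL' values come from a text file, one alternative is to mark them as missing at import time. This is a sketch that assumes the data is read with read.csv and that the missing markers are the empty string and the literal text NULL; the inline CSV is made up for illustration:

```r
# Hypothetical CSV content with empty fields and literal "NULL" markers
csv_text <- "id,name,city
1,alice,NULL
2,,Oslo
3,carol,"

# na.strings lists every string that should be read as NA
df <- read.csv(
  text = csv_text,
  na.strings = c("", "NULL", "NA"),
  stringsAsFactors = FALSE
)

print(df)
```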