I'd like to write some code that would take a given data frame, check to see if any columns are missing, and if so, add the missing columns filled with 0 or NA. Here's what I've got:
> df
x1 x2 x4
1 0 1 3
2 3 1 3
3 1 2 1
> nameslist <- c("x1","x2","x3","x4")
> miss.names <- !nameslist %in% colnames(df)
> holder <- rbind(nameslist,miss.names)
> miss.cols <- subset(holder[1,], holder[2,] == "TRUE")
Beyond this point, I can't figure out how to add in the missing column ("x3") without hardcoding it. Ideally, I'd want the new, complete data frame to have columns in the same order as nameslist as well.
Any ideas? My current code can be ignored, no problem.
To find the missing values in every column, apply is.na() to each column, then use which() to locate the NAs or sum() to count them.
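As a rough sketch of that idea, assuming a small made-up data frame `d` (it is not from the question), `lapply()` and `sapply()` can run `is.na()` column by column:

```r
# Hypothetical example data; `d` is an assumption, not the asker's data
d <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6), c = 7:9)

# Row positions of the NAs in each column (a list, since lengths differ)
na_rows <- lapply(d, function(col) which(is.na(col)))

# Number of NAs in each column (a named integer vector)
na_counts <- sapply(d, function(col) sum(is.na(col)))
```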
In R, a simple way to find the columns that contain missing values is to combine is.na() and colSums(). First, count the number of NAs per column. Then use names() or colnames() to return the names of the columns with at least one missing value.
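A minimal sketch of the is.na() plus colSums() combination, again on a made-up data frame `d` (an assumption for illustration only):

```r
# Hypothetical example data; `d` is an assumption, not the asker's data
d <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6), c = 7:9)

# is.na(d) is a logical matrix; colSums() counts the TRUEs per column
na_per_col <- colSums(is.na(d))

# Keep only the names of columns with at least one NA
cols_with_na <- names(na_per_col)[na_per_col > 0]
```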
The classic way to replace NAs in R is with the is.na() function (lowercase; R is case-sensitive). is.na() takes a vector or data frame as input and returns a logical object that indicates whether each value is missing (TRUE or FALSE). You can then use this logical object to index the missing values and assign them a zero.
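For instance, assuming a made-up data frame `d` (not from the question), the logical matrix from is.na() can be used directly as an index:

```r
# Hypothetical example data; `d` is an assumption, not the asker's data
d <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))

# Assign 0 to every cell where is.na(d) is TRUE
d[is.na(d)] <- 0
```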
Here's a straightforward approach:
df <- data.frame(a=1:4, e=4:1)
nms <- c("a", "b", "d", "e") # Vector of columns you want in this data.frame
Missing <- setdiff(nms, names(df)) # Find names of missing columns
df[Missing] <- 0 # Add them, filled with '0's
df <- df[nms] # Put columns in desired order
# a b d e
# 1 1 0 0 4
# 2 2 0 0 3
# 3 3 0 0 2
# 4 4 0 0 1
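Applied to the data in the question, the same steps could be wrapped in a small helper. This is only a sketch: the name `add_missing_cols` and its `fill` argument are hypothetical, and the default of NA is a choice made here because the question allows either 0 or NA.

```r
# Hypothetical helper (not part of base R): add any columns named in `nms`
# that `df` lacks, filled with `fill`, then return the columns in `nms` order
add_missing_cols <- function(df, nms, fill = NA) {
  to_add <- setdiff(nms, names(df))      # names of the missing columns
  if (length(to_add) > 0) {
    df[to_add] <- fill                   # add them, recycling `fill`
  }
  df[nms]                                # reorder (and keep only) nms
}

# Data frame from the question
df <- data.frame(x1 = c(0, 3, 1), x2 = c(1, 1, 2), x4 = c(3, 3, 1))
nameslist <- c("x1", "x2", "x3", "x4")

res_na   <- add_missing_cols(df, nameslist)            # x3 filled with NA
res_zero <- add_missing_cols(df, nameslist, fill = 0)  # x3 filled with 0
```

Note that the final `df[nms]` step also drops any columns of `df` that are not listed in `nms`, which may or may not be what you want.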