R: Find missing columns, add to data frame if missing

Tags:

r

I'd like to write some code that would take a given data frame, check to see if any columns are missing, and if so, add the missing columns filled with 0 or NA. Here's what I've got:

> df
   x1 x2 x4
1   0  1  3
2   3  1  3
3   1  2  1

> nameslist <- c("x1","x2","x3","x4")
> miss.names <- !nameslist %in% colnames(df)
> holder <- rbind(nameslist,miss.names)
> miss.cols <- subset(holder[1,], holder[2,] == "TRUE")

Beyond this point, I can't figure out how to add in the missing column ("x3") without hardcoding it. Ideally, I'd want the new, complete data frame to have columns in the same order as nameslist as well.

Any ideas? My current code can be ignored, no problem.


asked Feb 11 '12 01:02

bosbmgatl

1 Answer

Here's a straightforward approach:

df <- data.frame(a=1:4, e=4:1)
nms <- c("a", "b", "d", "e")   # Vector of columns you want in this data.frame

Missing <- setdiff(nms, names(df))  # Find names of missing columns
df[Missing] <- 0                    # Add them, filled with '0's
df <- df[nms]                       # Put columns in desired order
#   a b d e
# 1 1 0 0 4
# 2 2 0 0 3
# 3 3 0 0 2
# 4 4 0 0 1

answered Nov 10 '22 07:11

Josh O'Brien
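
The same setdiff idea can be applied to the data frame from the question, filling with NA instead of 0. What follows is only a minimal sketch: the helper name add_missing_cols and its arguments (wanted, fill) are hypothetical and do not appear in the answer above, and the x1, x2, x4 values are copied from the question's printed df.

```r
# Hypothetical helper (name and arguments are illustrative, not from the answer):
# add every column named in `wanted` that `df` lacks, then order columns as in `wanted`.
add_missing_cols <- function(df, wanted, fill = NA) {
  missing <- setdiff(wanted, names(df))   # names in `wanted` that df does not have
  if (length(missing) > 0) {
    df[missing] <- fill                   # one new column per missing name
  }
  df[wanted]                              # reorder; also drops columns not in `wanted`
}

# Data frame from the question
df <- data.frame(x1 = c(0, 3, 1), x2 = c(1, 1, 2), x4 = c(3, 3, 1))
nameslist <- c("x1", "x2", "x3", "x4")

out <- add_missing_cols(df, nameslist)
out
#   x1 x2 x3 x4
# 1  0  1 NA  3
# 2  3  1 NA  3
# 3  1  2 NA  1
```

Passing fill = 0 reproduces the zero-filled result. Note that the final df[wanted] step keeps only the columns named in wanted, so any extra columns in df are dropped; if they should be kept, that line would need to change.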

