I have searched extensively but not found an answer to this question on Stack Overflow.
Let's say I have a data frame a.
I define:
a <- NULL
a <- as.data.frame(a)
If I wanted to add a column to this data frame like so:
a$col1 <- c(1,2,3)
I get the following error:
Error in `$<-.data.frame`(`*tmp*`, "col1", value = c(1, 2, 3)) : replacement has 3 rows, data has 0
Why is the row dimension fixed but the column is not?
How do I change the number of rows in a data frame?
If I do this (inputting the data into a list first and then converting to a df), it works fine:
a <- NULL
a$col1 <- c(1,2,3)
a <- as.data.frame(a)
The easiest way to add an empty column to a data frame in R is to use the add_column() function:
dataf %>% add_column(new_col = NA)
Note that add_column() comes from the tibble package, so this requires installing tibble or the tidyverse.
If you want to create an empty data.frame with dynamic names (column names held in a variable), this can help:
names <- c("v","u","w")
df <- data.frame()
for (k in names) df[[k]] <- as.numeric()
You can change the type as well if you need to.
The row dimension is not fixed, but data.frames are stored as a list of vectors that are constrained to have the same length. You cannot add col1 to a because col1 has three values (rows) and a has zero, which breaks the constraint. By default, R does not auto-vivify values when you try to extend a data.frame by adding a column that is longer than the data.frame. The second example works because col1 is the only vector in the list when it is converted, so the data.frame is initialized with three rows.
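To make the length constraint concrete, here is a minimal sketch in base R. The variable names (empty, result, a) are only illustrative, and building the data frame from a 3 x 0 matrix is one assumed way of fixing the row count up front, not the only one.

```r
# An empty data frame has zero rows, so a length-3 column violates
# the "all columns have the same length" constraint.
empty <- data.frame()
result <- tryCatch({
  empty$col1 <- c(1, 2, 3)
  "ok"
}, error = function(e) conditionMessage(e))
print(result)  # the error message about mismatched row counts

# Starting from a data frame that already has three rows (and no
# columns), the same assignment succeeds.
a <- data.frame(matrix(nrow = 3, ncol = 0))
a$col1 <- c(1, 2, 3)
print(dim(a))

# A data frame is still a list underneath: one element per column.
print(is.list(a))
```

If the number of rows is known in advance, creating the data frame with that many rows avoids the error without going through a list first.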
If you want to automatically have the data.frame expand, you can use the following function:
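cbind.all <- function (...) {
  nm <- list(...)
  # Treat every argument as a matrix so nrow() and ncol() are defined
  nm <- lapply(nm, as.matrix)
  # Row count of the tallest input
  n <- max(sapply(nm, nrow))
  # Pad each input with NA rows up to n, then bind the columns together
  do.call(cbind, lapply(nm, function(x)
    rbind(x, matrix(, n - nrow(x), ncol(x)))))
}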
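This will fill missing values with NA. You would use it like this:
cbind.all(df, a)
Note that it returns a matrix, because every argument is passed through as.matrix(); wrap the result in as.data.frame() if you need a data frame.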
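The same padding idea can also be sketched without matrices, by extending each vector to a common length before building the data frame. This is an alternative under stated assumptions, not the function from the answer: pad_to and the sample vectors are hypothetical names made up for illustration.

```r
# Hypothetical helper: extend a vector to length n.
# Assigning a larger length pads the vector with NA.
pad_to <- function(x, n) {
  length(x) <- n
  x
}

# Columns of unequal length (made-up example data)
cols <- list(col1 = c(1, 2, 3), col2 = c(10, 20))

# Pad everything to the longest column, then convert once
n <- max(lengths(cols))
df <- as.data.frame(lapply(cols, pad_to, n = n))
print(df)
```

Because the conversion goes through a named list, the result is a data frame and each column keeps its own type.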
You could also do something like this where I read in data from multiple files, grab the column I want, and store it in the dataframe. I check whether the dataframe has anything in it, and if it doesn't, create a new one rather than getting the error about mismatched number of rows:
readCounts = data.frame()
for (f in names(files)) {
  d = read.table(files[f], header = TRUE, as.is = TRUE)
  d2 = round(data.frame(d$NumReads))
  colnames(d2) = f
  if (ncol(readCounts) == 0) {
    readCounts = d2
    rownames(readCounts) = d$Name
  } else {
    readCounts = cbind(readCounts, d2)
  }
}
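A variant of this pattern collects the columns in a list and converts to a data frame once at the end, which sidesteps the empty-data-frame check entirely. The sketch below uses in-memory tables as stand-ins for the files; the sample names and values are invented, and it assumes every table lists the same Name values in the same order.

```r
# Hypothetical stand-ins for the per-file tables read.table() would return
tables <- list(
  sampleA = data.frame(Name = c("g1", "g2", "g3"),
                       NumReads = c(10.4, 0.2, 7.6)),
  sampleB = data.frame(Name = c("g1", "g2", "g3"),
                       NumReads = c(3.4, 8.1, 1.9))
)

# Pull the wanted column out of each table; lapply() keeps the list names
cols <- lapply(tables, function(d) round(d$NumReads))

# Convert once: all vectors have the same length, so no row mismatch
readCounts <- as.data.frame(cols)
rownames(readCounts) <- tables[[1]]$Name
print(readCounts)
```

With real files, the tables list would be built with something like lapply(files, read.table, header = TRUE, as.is = TRUE) instead of being written out by hand.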