I have many data frames with sequentially numbered names:
df.1 <- data.frame("x"=c(1,2), "y"=2)
df.2 <- data.frame("x"=c(2,4), "y"=4)
df.3 <- data.frame("x"=2, "y"=c(4,5))
All data frames have the same number of rows and columns. I want to bind them, adding a column with the id of the data frame. The id would be the name of the source data frame.
I know I could do this manually:
rbind(data.frame(id = "df.1", df.1),
data.frame(id = "df.2", df.2),
data.frame(id = "df.3", df.3))
But there's a lot of them and their number will change in the future.
I tried writing for loops but they didn't work. I suppose that's because I'm basing them on a list of strings containing data frames' names rather than a list of data frames themselves.
df_names <- ls(pattern = "df.\\d+")
for (i in df_names) {
i$id <- i
i
}
...but I also haven't found any automated way of creating a list of data frames with repeatable names. And even if I do, I'm not that sure the for-loop above would work :)
You could use parse and eval to get the data frames from df_names:
do.call(rbind, lapply(df_names, function(x){data.frame(id=x, eval(parse(text=x)))}))
id x y
1 df.1 1 2
2 df.1 2 2
3 df.2 2 4
4 df.2 4 4
5 df.3 2 4
6 df.3 2 5
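As a possible simplification of the same idea, get() looks an object up by its name directly, so the parse/eval round trip can be skipped. This is only a sketch: the anchored pattern with an escaped dot and the names res and nm are assumptions added here, not part of the original answer.

```r
# Example data frames from the question
df.1 <- data.frame(x = c(1, 2), y = 2)
df.2 <- data.frame(x = c(2, 4), y = 4)
df.3 <- data.frame(x = 2, y = c(4, 5))

# Anchored pattern with an escaped dot, so only names like "df.1" match
df_names <- ls(pattern = "^df\\.\\d+$")

# For each name, fetch the object with get() and prepend the name as an id column
res <- do.call(rbind, lapply(df_names, function(nm) data.frame(id = nm, get(nm))))
res
```

The result has the same three columns (id, x, y) as the parse/eval version above.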
There is also combine from the "gdata" package:
library(gdata)
combine(df.1, df.2, df.3)
# x y source
# 1 1 2 df.1
# 2 2 2 df.1
# 3 2 4 df.2
# 4 4 4 df.2
# 5 2 4 df.3
# 6 2 5 df.3
Another approach using mget:
dat <- do.call(rbind, mget(df_names))
dat$id <- sub("\\.\\d+$", "", rownames(dat))
# x y id
# df.1.1 1 2 df.1
# df.1.2 2 2 df.1
# df.2.1 2 4 df.2
# df.2.2 4 4 df.2
# df.3.1 2 4 df.3
# df.3.2 2 5 df.3
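One caveat with recovering the id from the row names: as far as I can tell, rbind only appends a ".1", ".2", ... suffix when a data frame has more than one row, so for a single-row data frame the regex would strip part of the name itself. A hedged sketch that sidesteps this by attaching the id before binding is below; the names df_list, d, and nm, and the anchored pattern, are assumptions introduced for illustration.

```r
# Example data frames from the question
df.1 <- data.frame(x = c(1, 2), y = 2)
df.2 <- data.frame(x = c(2, 4), y = 4)
df.3 <- data.frame(x = 2, y = c(4, 5))

# Collect the data frames into a named list, keyed by object name
df_names <- ls(pattern = "^df\\.\\d+$")
df_list <- mget(df_names)

# Add the id column to each data frame first, then stack the results
dat <- do.call(rbind, Map(function(d, nm) data.frame(id = nm, d),
                          df_list, names(df_list)))

# Drop the generated row names, since the id column now carries that information
rownames(dat) <- NULL
dat
```

Because the id comes from the list names rather than from the row names, this should not depend on how many rows each data frame has.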