
Bind many data frames, adding a column with their id [duplicate]

Tags: dataframe, r

I have many data frames whose names follow a repeating pattern:

df.1 <- data.frame("x"=c(1,2), "y"=2)
df.2 <- data.frame("x"=c(2,4), "y"=4)
df.3 <- data.frame("x"=2, "y"=c(4,5))

All data frames have the same number of rows and columns. I want to bind them, adding a column with the id of the data frame. The id would be the name of the source data frame.

I know I could do this manually:

rbind(data.frame(id = "df.1", df.1),
      data.frame(id = "df.2", df.2),
      data.frame(id = "df.3", df.3))

But there's a lot of them and their number will change in the future.

I tried writing for loops, but they didn't work. I suppose that's because I'm basing them on a vector of strings containing the data frames' names rather than on a list of the data frames themselves.

df_names <- ls(pattern = "^df\\.\\d+$")

for (i in df_names) {
  i$id <- i
  i
}

...but I also haven't found an automated way to create a list of data frames whose names follow a pattern. And even if I did, I'm not sure the for loop above would work :)

Kuba Krukar asked Feb 01 '14 20:02
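
A likely reason the loop in the question fails: `i` holds a name (a character string), not the data frame it refers to, so `i$id <- i` never touches `df.1`. Below is a minimal base R sketch of one way around this, assuming the three example data frames from the question. The names `df_list`, `nm`, and `combined` are illustrative, and the anchored pattern is an assumption about which objects should match.

```r
# Example data from the question
df.1 <- data.frame(x = c(1, 2), y = 2)
df.2 <- data.frame(x = c(2, 4), y = 4)
df.3 <- data.frame(x = 2, y = c(4, 5))

# Names of matching objects; the dot is escaped and the pattern is anchored
# (assumption: only objects named exactly df.<digits> should be picked up)
df_names <- ls(pattern = "^df\\.\\d+$")

# mget() turns the vector of names into a named list of data frames
df_list <- mget(df_names)

# Loop over the list by name and modify the list element, not the string
for (nm in names(df_list)) {
  df_list[[nm]]$id <- nm
}

# Stack everything and drop the automatically generated row names
combined <- do.call(rbind, df_list)
rownames(combined) <- NULL
```

`combined` should then have six rows with columns `x`, `y`, and `id`. Since `ls()` returns names in alphabetical order, a name like `df.10` would sort before `df.2`; reorder `df_names` first if row order matters.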



3 Answers

You could use parse and eval to get the data frames from df_names:

do.call(rbind, lapply(df_names, function(x){data.frame(id=x, eval(parse(text=x)))}))


    id x y
1 df.1 1 2
2 df.1 2 2
3 df.2 2 4
4 df.2 4 4
5 df.3 2 4
6 df.3 2 5
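
A possible variation on the same idea, sketched here as an assumption rather than as part of the original answer: base R's `get()` looks an object up by name, which avoids parsing and evaluating text. The example data are redefined so the block runs on its own; `with_id` and `result` are illustrative names.

```r
# Example data from the question
df.1 <- data.frame(x = c(1, 2), y = 2)
df.2 <- data.frame(x = c(2, 4), y = 4)
df.3 <- data.frame(x = 2, y = c(4, 5))

# Written out explicitly here; ls(pattern = ...) would give the same vector
df_names <- c("df.1", "df.2", "df.3")

# get() fetches each data frame by its name; id is recycled to every row
with_id <- lapply(df_names, function(nm) data.frame(id = nm, get(nm)))

# Stack the pieces into one data frame
result <- do.call(rbind, with_id)
```

The resulting columns are `id`, `x`, `y`, in the same layout as the output shown above.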
user1981275 answered Oct 29 '22 23:10


There is also combine from the "gdata" package:

library(gdata)
combine(df.1, df.2, df.3)
#   x y source
# 1 1 2   df.1
# 2 2 2   df.1
# 3 2 4   df.2
# 4 4 4   df.2
# 5 2 4   df.3
# 6 2 5   df.3
A5C1D2H2I1M1N2O1R2T1 answered Oct 30 '22 00:10


Another approach using mget:

dat <- do.call(rbind, mget(df_names))
dat$id <- sub("\\.\\d+$", "", rownames(dat))

#        x y   id
# df.1.1 1 2 df.1
# df.1.2 2 2 df.1
# df.2.1 2 4 df.2
# df.2.2 4 4 df.2
# df.3.1 2 4 df.3
# df.3.2 2 5 df.3
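
The id here is recovered from the row names that `rbind` generates. A sketch of a variant that does not depend on the row-name format, assuming the same example data: repeat each list name once per row of its data frame. This may matter if some data frame has a single row, since `rbind` then appears to label that row with the bare name (`df.1` rather than `df.1.1`), and the regular expression would strip the wrong part. `df_list` is an illustrative name standing in for `mget(df_names)`.

```r
# Example data from the question
df.1 <- data.frame(x = c(1, 2), y = 2)
df.2 <- data.frame(x = c(2, 4), y = 4)
df.3 <- data.frame(x = 2, y = c(4, 5))

# Same shape as mget(df_names): a list of data frames named after the objects
df_list <- list(df.1 = df.1, df.2 = df.2, df.3 = df.3)

dat <- do.call(rbind, df_list)

# Repeat each name once per row of the data frame it came from
dat$id <- rep(names(df_list), times = vapply(df_list, nrow, integer(1)))

# Optional: replace the generated row names with plain 1..n
rownames(dat) <- NULL
```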
Sven Hohenstein answered Oct 29 '22 22:10