
do.call("rbind", list(data, frames)) but also index each row by its original data frame [duplicate]

Tags: r, dplyr

df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)

# A common method loses the origin of each row.
do.call("rbind", list(df1, df2))
##   a b
## 1 1 3
## 2 2 4
## 3 5 7
## 4 6 8

# Whereas here, X1 records which data frame each row originated in.
library(plyr)
adply(list(df1, df2), 1)
##   X1 a b
## 1  1 1 3
## 2  1 2 4
## 3  2 5 7
## 4  2 6 8

Are there any other ways to do this, perhaps more efficient ones?

nacnudus asked Oct 20 '22 19:10


2 Answers

Here is one way.

library(dplyr)
library(tidyr)

foo <- list(df1, df2)

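# Note: calling unnest() directly on a list relied on an older tidyr release;
# it may fail with current versions.
unnest(foo, names) %>%
  mutate(names = gsub("^X", "", names))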

#  names a b
#1     1 1 3
#2     1 2 4
#3     2 5 7
#4     2 6 8
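
For readers on a current dplyr, a minimal sketch of the same idea uses bind_rows() with its .id argument. This swaps in a different function from the answer above and is not what the answer originally used. The column name "names" is kept only to match the output shown, and the object name combined is made up for illustration.

```r
library(dplyr)

df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)
foo <- list(df1, df2)

# .id adds a column recording which list element each row came from.
# For an unnamed list the identifier is the element's position,
# stored as a character vector ("1", "2", ...).
combined <- bind_rows(foo, .id = "names")
combined
```

If the list is named, e.g. list(first = df1, second = df2), the identifier column holds those names instead of positions.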

jazzurro answered Oct 22 '22 19:10


With base:

df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)

frames <- list(df1, df2)

do.call(rbind, lapply(seq_along(frames), function(x) {
  frames[[x]]$X1 <- x
  frames[[x]]
}))

##   a b X1
## 1 1 3  1
## 2 2 4  1
## 3 5 7  2
## 4 6 8  2
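
A variation on the same base approach, offered as a sketch and not as a benchmarked improvement: bind the frames first, then build the index column in one step from each frame's row count. The name combined is hypothetical. This version assumes every element of frames is a data frame with matching columns.

```r
df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)

frames <- list(df1, df2)

# Row-bind all frames, then repeat each frame's position once per row it contributed.
combined <- do.call(rbind, frames)
combined$X1 <- rep(seq_along(frames), times = vapply(frames, nrow, integer(1)))
combined

##   a b X1
## 1 1 3  1
## 2 2 4  1
## 3 5 7  2
## 4 6 8  2
```

This avoids modifying a copy of each frame inside lapply(); whether it is actually faster would need to be measured on real data.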

As an aside, if you want to see how plyr does this, have a gander at the source of plyr::adply, plyr:::splitter_a and plyr::ldply. These answers are trivial compared to that :-)

hrbrmstr answered Oct 22 '22 20:10