Dear StackOverFlowers (flowers in short),
I have a list of data.frames (walk.sample) that I would like to collapse into a single (giant) data.frame. While collapsing, I would like to mark (by adding another column) which rows came from which element of the list. This is what I've got so far.
This is the list that needs to be collapsed/stacked.
> walk.sample
[[1]]
walker x y
1073 3 228.8756 -726.9198
1086 3 226.7393 -722.5561
1081 3 219.8005 -728.3990
1089 3 225.2239 -727.7422
1032 3 233.1753 -731.5526
[[2]]
walker x y
1008 3 205.9104 -775.7488
1022 3 208.3638 -723.8616
1072 3 233.8807 -718.0974
1064 3 217.0028 -689.7917
1026 3 234.1824 -723.7423
[[3]]
[1] 3
[[4]]
walker x y
546 2 629.9041 831.0852
524 2 627.8698 873.3774
578 2 572.3312 838.7587
513 2 633.0598 871.7559
538 2 636.3088 836.6325
1079 3 206.3683 -729.6257
1095 3 239.9884 -748.2637
1005 3 197.2960 -780.4704
1045 3 245.1900 -694.3566
1026 3 234.1824 -723.7423
I have written a function that adds a column denoting which element the rows came from, and then appends the result to an existing data.frame.
collapseToDataFrame <- function(x) {  # collapse list to a dataframe with a twist
  walk.df <- data.frame()
  for (i in seq_along(x)) {  # seq_along() is safe when x is empty, unlike 1:length(x)
    n.rows <- nrow(x[[i]])
    if (length(x[[i]]) > 1) {  # a data.frame's length is its column count
      temp.df <- cbind(x[[i]], rep(i, n.rows))
      names(temp.df) <- c("walker", "x", "y", "session")
      walk.df <- rbind(walk.df, temp.df)
    } else {
      cat("Empty list", "\n")
    }
  }
  return(walk.df)
}
> collapseToDataFrame(walk.sample)
Empty list
Empty list
walker x y session
3 1 -604.5055 -123.18759 1
60 1 -562.0078 -61.24912 1
84 1 -594.4661 -57.20730 1
9 1 -604.2893 -110.09168 1
43 1 -632.2491 -54.52548 1
1028 3 240.3905 -724.67284 1
1040 3 232.5545 -681.61225 1
1073 3 228.8756 -726.91980 1
1091 3 209.0373 -740.96173 1
1036 3 248.7123 -694.47380 1
I'm curious whether this can be done more elegantly, perhaps with do.call() or some other more generic function?
I think this will work...
lengths <- sapply(walk.sample, function(x) if (is.null(nrow(x))) 0 else nrow(x))
cbind(do.call(rbind, walk.sample[lengths > 1]),
      session = rep(1:length(lengths), ifelse(lengths > 1, lengths, 0)))
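To see why the rep() call lines up with the stacked rows, here is a minimal sketch on a made-up list. walk.toy and its values are hypothetical stand-ins, not the poster's data. It assumes every data.frame in the list has the same columns. It also keeps elements with at least one row (n.rows > 0), a small change from the > 1 test above, which would drop a one-row data.frame.

```r
# Toy stand-in for walk.sample (hypothetical data).
walk.toy <- list(
  data.frame(walker = c(3, 3), x = c(1.5, 2.5), y = c(-1, -2)),
  3,  # a non-data.frame element, like [[3]] in the question
  data.frame(walker = c(2, 2, 2), x = c(7, 8, 9), y = c(4, 5, 6))
)

# Rows per element; nrow() returns NULL for a bare vector, so count that as 0.
n.rows <- sapply(walk.toy, function(x) if (is.null(nrow(x))) 0 else nrow(x))

# rep() with a vector of counts repeats each index that many times;
# an element with a count of 0 contributes nothing.
session <- rep(seq_along(n.rows), n.rows)

# Stack only the elements that have rows, then attach the matching labels.
stacked <- cbind(do.call(rbind, walk.toy[n.rows > 0]), session = session)
print(stacked)
```

Because the labels and the stacked rows are both built in list order, the two line up without any explicit loop.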
I'm not claiming this is the most elegant approach, but I think it works:
library(plyr)
ldply(sapply(1:length(walk.sample), function(i)
  if (length(walk.sample[[i]]) > 1)
    cbind(walk.sample[[i]], session = rep(i, nrow(walk.sample[[i]])))
), rbind)
EDIT
After applying Marek's apt remarks
do.call(rbind, lapply(1:length(walk.sample), function(i)
  if (length(walk.sample[[i]]) > 1)
    cbind(walk.sample[[i]], session = i)))
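One caveat with the length(...) > 1 test: length() of a data.frame is its number of columns, so the check separates the data.frames from the stray scalar only because they happen to have three columns. A possible variant, sketched here on a hypothetical toy list (walk.toy is made up for illustration), tests is.data.frame() directly and uses Map() to pair each kept element with its original index. It assumes the kept data.frames share the same columns and have at least one row each.

```r
# Toy stand-in for walk.sample (hypothetical data).
walk.toy <- list(
  data.frame(walker = c(3, 3), x = c(1.5, 2.5), y = c(-1, -2)),
  3,  # a non-data.frame element that should be skipped
  data.frame(walker = c(2, 2, 2), x = c(7, 8, 9), y = c(4, 5, 6))
)

# Original positions of the elements that really are data.frames.
keep <- which(vapply(walk.toy, is.data.frame, logical(1)))

# Pair each kept data.frame with its position; cbind() recycles the
# single index value across all rows of that data.frame.
tagged <- Map(function(df, i) cbind(df, session = i), walk.toy[keep], keep)

# Stack the tagged pieces into one data.frame.
result <- do.call(rbind, tagged)
print(result)
```

The session column keeps the original list positions (1 and 3 here), so skipped elements leave a gap in the numbering, as in the answers above.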