I am trying to reorganize my data, basically a list of data.frames. Its elements represent subjects of interest (A and B), with observations on x and y, collected on two occasions (1 and 2). I am trying to make this a list that contains data.frames referring to the subjects, with the information on which occasion x and y were collected being stored in the respective data.frames as new variable, as opposed to the element name:
library('rlist')
A1 <- data.frame(x=sample(1:100,2),y=sample(1:100,2))
A2 <- data.frame(x=sample(1:100,2),y=sample(1:100,2))
B1 <- data.frame(x=sample(1:100,2),y=sample(1:100,2))
B2 <- data.frame(x=sample(1:100,2),y=sample(1:100,2))
list <- list(A1=A1,A2=A2,B1=B1,B2=B2)
A <- do.call(rbind,list.match(list,"A"))
B <- do.call(rbind,list.match(list,"B"))
list <- list(A=A,B=B)
list <- lapply(list,function(x) {
y <- data.frame(x)
y$class <- c(rep.int(1,2),rep.int(2,2))
return(y)
})
> list
$A
x y class
A1.1 66 96 1
A1.2 76 58 1
A2.1 50 93 2
A2.2 57 12 2
$B
x y class
B1.1 58 56 1
B1.2 69 15 1
B2.1 77 77 2
B2.2 9 9 2
In my real world problem there are about 500 subjects, not always two occasions, differing numbers of observations.
So my example above is just to illustrate where I want to get, and I am stuck at how to pass to the do.call-rbind that it should, based on elements names, bind subject-specific elements as new list elements together, while assigning a new variable.
To me, this is a somewhat fuzzy task, and the closest I got was the rlist package. This question is related but uses unique to identify elements, whereas in my case it seems to be more a regex problem.
I'd be happy even for instructions on how to use google, any keywords for further research etc.
From the data you provided:
subj <- sub("[A-Z]*", "", names(lst))
newlst <- Map(function(x, y) {x[,"class"] <- y;x}, lst, subj)
First we do the regular expression call to isolate the number that will go in the class column. In this case, I matched on capital letters and erased them leaving the number. Therefore, "A1" becomes "1". Please note that the real names will mean a different regex pattern.
Then we use Map to create a new column for each data frame and save to a new list called newlst. Map takes the first element of each argument and carries out the function then continues on with each object element. So the first data frame in lst and the first number in subj are used first. The anonymous function I used is function(x,y) {x[, "class"] <- y; x}. It takes two arguments. The first is the data frame, the second is the column value.
Now it's much easier to move forward. We can create a vector called uniq.nmes to get the names of the data frames that we will combine. Where "A1" will become "A". Then we can rbind on that match:
uniq.nmes <- unique(sub("\\d", "", names(lst)))
lapply(uniq.nmes, function(x) {
do.call(rbind, newlst[grep(x, names(newlst))])
})
# [[1]]
# x y class
# A1.1 1 79 1
# A1.2 30 13 1
# A2.1 90 39 2
# A2.2 43 22 2
#
# [[2]]
# x y class
# B1.1 54 59 1
# B1.2 83 90 1
# B2.1 85 36 2
# B2.2 91 28 2
Data
A1 <- data.frame(x=sample(1:100,2),y=sample(1:100,2))
A2 <- data.frame(x=sample(1:100,2),y=sample(1:100,2))
B1 <- data.frame(x=sample(1:100,2),y=sample(1:100,2))
B2 <- data.frame(x=sample(1:100,2),y=sample(1:100,2))
lst <- list(A1=A1,A2=A2,B1=B1,B2=B2)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With