I know this has been asked numerous times on here under the rubric of "long to wide", but I've run into a situation where I have two value variables that are repeated measures:
          id sex  time      score1    score2
1  subject 1   m Time1 -0.20926263 0.2499310
2  subject 2   m Time1  0.17147511 3.2708905
3  subject 3   m Time1 -0.82619584 0.5993917
4  subject 4   f Time1 -0.95568823 4.4729726
5  subject 5   f Time1 -2.29939525 8.0101254
6  subject 1   m Time2 -0.37914702 3.6387589
7  subject 2   m Time2  0.26759909 4.9027533
8  subject 3   m Time2  0.07727621 2.1848642
9  subject 4   f Time2 -0.08613439 5.8747074
10 subject 5   f Time2 -0.02743044 4.3963938
11 subject 1   m Time3  0.07176053 3.7959496
12 subject 2   m Time3  0.46463917 5.2494579
13 subject 3   m Time3 -0.68764512 2.2639503
14 subject 4   f Time3 -0.56670061 2.3361909
15 subject 5   f Time3  1.70731774 5.8345116
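Quick way to reproduce the data frame (DF). The scores are random draws, so the values will differ from those shown above:
# 5 subjects measured at 3 time points; score1 and score2 are random
DF <- data.frame(id = rep(paste("subject", 1:5, sep = " "), 3),
                 sex = rep(c("m", "m", "m", "f", "f"), 3),
                 time = c(rep("Time1", 5), rep("Time2", 5), rep("Time3", 5)),
                 score1 = rnorm(15), score2 = abs(rnorm(15) * 4))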
I can solve the long-to-wide problem for two repeated-measures variables using the reshape function from base, but I was hoping for a plyr or reshape2/reshape answer, as these packages are generally much more intuitive to me. If you have any other solutions, go ahead and provide them, as the learning would be great.
Solution from base:
wide <- reshape(DF, v.names = c("score1", "score2"), idvar = "id",
                timevar = "time", direction = "wide")
wide
I think this will do it:
library(reshape)
m <- melt(DF)
Simplest, but time and score are in the opposite order from your example (in case it matters):
cast(m,id+sex~...)
Or more explicitly:
cast(m,id+sex~variable+time)
You can cut this down to a one-liner:
recast(DF,id+sex~...)
If you like, you can use the newer reshape2 package instead of reshape, replacing cast with dcast (the version of recast included in reshape2 doesn't give the desired result).
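For reference, here is a minimal sketch of what the reshape2 variant might look like. The id.vars are spelled out rather than guessed by melt, the set.seed call and the name wide2 are my own additions for illustration, and the time column is built with rep(..., each = 5) as a shorthand for the original construction:

```r
library(reshape2)

set.seed(1)  # assumption: fixed seed only so the sketch is repeatable
DF <- data.frame(id = rep(paste("subject", 1:5), 3),
                 sex = rep(c("m", "m", "m", "f", "f"), 3),
                 time = rep(c("Time1", "Time2", "Time3"), each = 5),
                 score1 = rnorm(15), score2 = abs(rnorm(15) * 4))

# Stack score1 and score2 into variable/value columns
m <- melt(DF, id.vars = c("id", "sex", "time"))

# One row per subject, one column per score-by-time combination
wide2 <- dcast(m, id + sex ~ variable + time)
wide2
```

Swapping the right-hand side to time + variable should give the other column ordering, as with cast above.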