
Reshaping a data frame with more than one measure variable


I'm using a data frame similar to this one:

df <- data.frame(student = c(rep(1, 5), rep(2, 5)), month = c(1:5, 1:5),
                 quiz1p1 = seq(20, 20.9, 0.1), quiz1p2 = seq(30, 30.9, 0.1),
                 quiz2p1 = seq(80, 80.9, 0.1), quiz2p2 = seq(90, 90.9, 0.1))

print(df)
   student month quiz1p1 quiz1p2 quiz2p1 quiz2p2
1        1     1    20.0    30.0    80.0    90.0
2        1     2    20.1    30.1    80.1    90.1
3        1     3    20.2    30.2    80.2    90.2
4        1     4    20.3    30.3    80.3    90.3
5        1     5    20.4    30.4    80.4    90.4
6        2     1    20.5    30.5    80.5    90.5
7        2     2    20.6    30.6    80.6    90.6
8        2     3    20.7    30.7    80.7    90.7
9        2     4    20.8    30.8    80.8    90.8
10       2     5    20.9    30.9    80.9    90.9

It describes grades received by students over five months, in two quizzes divided into two parts each.

I need to get the two quizzes into separate rows, so that each student in each month has two rows (one per quiz) and two columns (one per part of the quiz). When I melt the table:

dfL <- melt.data.frame(df, c("student", "month"))

I get the two parts of each quiz in separate rows too.

dcast(dfL,student+month~variable) 

of course gets me right back where I started, and I can't find a way to cast the table back into the required form. Is there a way to make the melt command work something like:

melt.data.frame(df, measure.var1 = c("quiz1p1", "quiz2p1"),
                measure.var2 = c("quiz1p2", "quiz2p2"))
eli-k asked Oct 11 '12 10:10


1 Answer

Here's how you could do this with reshape(), from base R:

df2 <- reshape(df, direction = "long",
               idvar = 1:2, varying = list(c(3, 5), c(4, 6)),
               v.names = c("p1", "p2"), times = c("quiz1", "quiz2"))

## Checking the output
rbind(head(df2, 3), tail(df2, 3))
#           student month  time   p1   p2
# 1.1.quiz1       1     1 quiz1 20.0 30.0
# 1.2.quiz1       1     2 quiz1 20.1 30.1
# 1.3.quiz1       1     3 quiz1 20.2 30.2
# 2.3.quiz2       2     3 quiz2 80.7 90.7
# 2.4.quiz2       2     4 quiz2 80.8 90.8
# 2.5.quiz2       2     5 quiz2 80.9 90.9
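To see how the arguments line up: each element of varying names the wide columns that get stacked into one long column, that long column takes its name from the matching entry of v.names, and times labels which wide column each row came from. Here is a quick sanity check of that mapping, as a sketch on the example data from the question (q1 and q2 are just illustrative names, not part of the original code):

```r
# Rebuild the example data from the question
df <- data.frame(student = c(rep(1, 5), rep(2, 5)), month = c(1:5, 1:5),
                 quiz1p1 = seq(20, 20.9, 0.1), quiz1p2 = seq(30, 30.9, 0.1),
                 quiz2p1 = seq(80, 80.9, 0.1), quiz2p2 = seq(90, 90.9, 0.1))

df2 <- reshape(df, direction = "long",
               idvar = 1:2, varying = list(c(3, 5), c(4, 6)),
               v.names = c("p1", "p2"), times = c("quiz1", "quiz2"))

# Split the long table by quiz label
q1 <- df2[df2$time == "quiz1", ]
q2 <- df2[df2$time == "quiz2", ]

# stopifnot() is silent when every condition holds
stopifnot(
  nrow(df2) == 2 * nrow(df),               # two rows per student/month
  isTRUE(all.equal(q1$p1, df$quiz1p1)),    # columns 3 and 5 feed p1
  isTRUE(all.equal(q1$p2, df$quiz1p2)),    # columns 4 and 6 feed p2
  isTRUE(all.equal(q2$p1, df$quiz2p1)),
  isTRUE(all.equal(q2$p2, df$quiz2p2))
)
```

If the order inside varying and times did not match, the values would land under the wrong quiz label, so a check like this is cheap insurance.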

You can also use column names (instead of column numbers) for idvar and varying. It's more verbose, but seems like better practice to me:

## The same operation as above, using just column *names*
df2 <- reshape(df, direction = "long", idvar = c("student", "month"),
               varying = list(c("quiz1p1", "quiz2p1"),
                              c("quiz1p2", "quiz2p2")),
               v.names = c("p1", "p2"), times = c("quiz1", "quiz2"))
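Note that reshape() stacks the result by quiz: all the quiz1 rows come first, then all the quiz2 rows. If you want each student's two quiz rows for a given month to sit next to each other, as the question describes, one option is to sort afterwards. This is a minimal sketch on the same example data; resetting the row names is optional and only done here for readability:

```r
# Rebuild the example data from the question
df <- data.frame(student = c(rep(1, 5), rep(2, 5)), month = c(1:5, 1:5),
                 quiz1p1 = seq(20, 20.9, 0.1), quiz1p2 = seq(30, 30.9, 0.1),
                 quiz2p1 = seq(80, 80.9, 0.1), quiz2p2 = seq(90, 90.9, 0.1))

df2 <- reshape(df, direction = "long", idvar = c("student", "month"),
               varying = list(c("quiz1p1", "quiz2p1"),
                              c("quiz1p2", "quiz2p2")),
               v.names = c("p1", "p2"), times = c("quiz1", "quiz2"))

# Sort so the two quiz rows of each student/month are adjacent
df2 <- df2[order(df2$student, df2$month, df2$time), ]

# Replace the "1.1.quiz1"-style row names with plain 1..n
rownames(df2) <- NULL

head(df2, 4)
```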
Josh O'Brien answered Sep 20 '22 05:09