
reshaping a data frame into long format in R

Tags: r, reshape

I'm struggling with a reshape in R. I have 2 types of error (err and rel_err) that have been calculated for 3 different models. This gives me a total of 6 error variables (i.e. err_1, err_2, err_3, rel_err_1, rel_err_2, and rel_err_3). For each of these types of error I have 3 different types of predictive validity tests (i.e. random holdouts, backcast, forecast). I would like to make my data set long, keeping the 3 types of test long while also making the two error measurements long. So in the end I will have one variable called err and one called rel_err, as well as an id variable for which model the error corresponds to (1, 2, or 3).

Here is my data right now:

iter       err_1  rel_err_1      err_2  rel_err_2      err_3  rel_err_3 test_type
1 -0.09385732 -0.2235443 -0.1216982 -0.2898543 -0.1058366 -0.2520759    random
1  0.16141630  0.8575728  0.1418732  0.7537442  0.1584816  0.8419816    back
1  0.16376930  0.8700738  0.1431505  0.7605302  0.1596502  0.8481901    front
1  0.14345986  0.6765194  0.1213689  0.5723444  0.1374676  0.6482615    random
1  0.15890059  0.7435382  0.1589823  0.7439204  0.1608709  0.7527580    back
1  0.14412360  0.6743928  0.1442039  0.6747684  0.1463520  0.6848202    front

and here is what I would like it to look like:

iter     model    err           rel_err    test_type
1        1        -0.09385732    (#'s)     random
1        2        -0.1216982     (#'s)     random
1        3        -0.1058366     (#'s)     random

and on...

I've tried playing around with the syntax but can't quite figure out what to put for the time.varying argument

Thanks very much for any help you can offer.

user1773115 asked Jan 15 '23 05:01


1 Answer

You could do it the "hard" way and build the long data frame by hand. For transparency you can refer to the columns by name.

with( dat, data.frame(iter = rep(iter, 3), 
      model = rep(1:3, each = nrow(dat)),
      err = c(err_1, err_2, err_3), 
      rel_err = c(rel_err_1, rel_err_2, rel_err_3), 
      test_type = rep(test_type, 3)) )

Or, for conciseness, indexes.

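# unlist() stacks the selected columns into one vector;
# iter and test_type are recycled to the full length by data.frame()
data.frame(iter = dat[,1], model = rep(1:3, each = nrow(dat)), 
          err = unlist(dat[,c(2, 4, 6)], use.names = FALSE), 
          rel_err = unlist(dat[,c(3, 5, 7)], use.names = FALSE), test_type = dat[,8])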

If you had a LOT of columns the hard way might involve grepping the column names.
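
One way the column-name grepping could look is sketched below. The toy dat (made-up values in the question's layout) and the helper names err_cols, rel_err_cols, and models are assumptions for illustration. The sketch also assumes the err_ and rel_err_ columns appear in the same model order.

```r
# Toy data in the same layout as the question's `dat` (values are made up).
dat <- data.frame(
  iter      = 1,
  err_1     = c(0.1, 0.2),
  rel_err_1 = c(0.3, 0.4),
  err_2     = c(0.5, 0.6),
  rel_err_2 = c(0.7, 0.8),
  err_3     = c(0.9, 1.0),
  rel_err_3 = c(1.1, 1.2),
  test_type = c("random", "back")
)

# Anchor the patterns so that "err_" does not also pick up "rel_err_".
err_cols     <- grep("^err_[0-9]+$", names(dat), value = TRUE)
rel_err_cols <- grep("^rel_err_[0-9]+$", names(dat), value = TRUE)

# Model ids parsed from the column-name suffixes.
models <- as.integer(sub("^err_", "", err_cols))

# unlist() stacks the columns one after another, so each model id
# is repeated once per original row.
long <- data.frame(
  iter      = rep(dat$iter, times = length(models)),
  model     = rep(models, each = nrow(dat)),
  err       = unlist(dat[err_cols], use.names = FALSE),
  rel_err   = unlist(dat[rel_err_cols], use.names = FALSE),
  test_type = rep(dat$test_type, times = length(models))
)
```

Nothing here hard-codes the number of models, so the same lines should carry over to a wider data frame as long as the naming pattern holds.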

This "hard" way was about as concise as reshape and required less thinking about how to use the commands. Sometimes I just skip thinking about reshape.
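
For comparison, a base reshape() call for the same task might look like the sketch below. The toy dat uses made-up values in the question's layout. The argument reshape() actually takes is varying (a list with one vector of column names per long-format variable), paired with v.names, timevar, and times. Treat this as one possible spelling, not the only one.

```r
# Toy data in the same layout as the question's `dat` (values are made up).
dat <- data.frame(
  iter      = 1,
  err_1     = c(0.1, 0.2),
  rel_err_1 = c(0.3, 0.4),
  err_2     = c(0.5, 0.6),
  rel_err_2 = c(0.7, 0.8),
  err_3     = c(0.9, 1.0),
  rel_err_3 = c(1.1, 1.2),
  test_type = c("random", "back")
)

long <- reshape(
  dat,
  direction = "long",
  # One list element per output variable, columns given in model order.
  varying   = list(c("err_1", "err_2", "err_3"),
                   c("rel_err_1", "rel_err_2", "rel_err_3")),
  v.names   = c("err", "rel_err"),
  timevar   = "model",   # name of the new column holding the model id
  times     = 1:3        # values that column takes, one per model
)

# reshape() adds a helper "id" column and compound row names;
# keep only the columns from the desired layout and reset the row names.
long <- long[, c("iter", "model", "err", "rel_err", "test_type")]
rownames(long) <- NULL
```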

John answered Jan 26 '23 01:01