
Reshape wide to long using data.table with multiple columns

Tags: r, data.table

I have a data frame in wide format, shown below, and I want to reshape it from wide to long using data.table's melt function. In a simple case I could split the data into two sets and rbind them, but in my case there are multiple test(i) and testgr(i) columns. There must be a better, more efficient way to do this. Thanks in advance.

From:

id <- c("106E1258", "106E2037", "104E1182", "105E1248", "105E1470", "10241247",
        "10241703")
yr <- c(2017, 2017, 2015, 2016, 2016, 2013, 2013)
finalgr <- c(72, 76, 75, 71, 75, 77, 78)
test01 <- c("R0560", "R0066", "R0308", "R0129", "R0354", "R0483",
            "R0503")
test01gr <- c(73, 74, 67, 80, 64, 80, 70)
test02 <- c("R0660", "R0266", "R0302", "R0139", "R0324", "R0383",
            "R0503")
test02gr <- c(71, 54, 67, 70, 68, 81, 61)
dt <- data.frame(id = id, yr = yr,
                 finalgr = finalgr,
                 test01 = test01, test01gr = test01gr,
                 test02 = test02, test02gr = test02gr)

To (only two of the ids shown):

id2 <- c("106E1258", "106E1258", "104E1182", "104E1182")
yr2 <- c(2017, 2017, 2015, 2015)
finalgr <- c(72, 72, 75, 75)
testid <- c("R0560", "R0660", "R0308", "R0302")
testgr <- c(73, 71, 67, 67)
dt2 <- data.frame(id = id2, yr = yr2, finalgr = finalgr, testid = testid, testgr = testgr)
changjx asked Jan 15 '18 14:01


1 Answer

You should indeed use melt, passing measure.vars a list with one element per output column:

library(data.table)

setDT(dt)  # convert the data.frame to a data.table by reference
melt(dt, id.vars = c('id', 'yr', 'finalgr'),
     measure.vars = list(testid = c('test01', 'test02'),
                         testgr = c('test01gr', 'test02gr')))
#           id   yr finalgr variable testid testgr
#  1: 106E1258 2017      72        1  R0560     73
#  2: 106E2037 2017      76        1  R0066     74
#  3: 104E1182 2015      75        1  R0308     67
#  4: 105E1248 2016      71        1  R0129     80
#  5: 105E1470 2016      75        1  R0354     64
#  6: 10241247 2013      77        1  R0483     80
#  7: 10241703 2013      78        1  R0503     70
#  8: 106E1258 2017      72        2  R0660     71
#  9: 106E2037 2017      76        2  R0266     54
# 10: 104E1182 2015      75        2  R0302     67
# 11: 105E1248 2016      71        2  R0139     70
# 12: 105E1470 2016      75        2  R0324     68
# 13: 10241247 2013      77        2  R0383     81
# 14: 10241703 2013      78        2  R0503     61
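
The melted rows are grouped by test slot, while the layout in the question keeps each id's tests together and has no variable column. One possible follow-up is sketched below, assuming the data.table package is installed. The name long is only illustrative, the sample is trimmed to two ids for brevity, and sorting by id reorders the ids alphabetically rather than keeping their original order.

```r
library(data.table)

# Two rows of the question's sample data, for brevity
dt <- data.table(
  id       = c("106E1258", "104E1182"),
  yr       = c(2017, 2015),
  finalgr  = c(72, 75),
  test01   = c("R0560", "R0308"),
  test01gr = c(73, 67),
  test02   = c("R0660", "R0302"),
  test02gr = c(71, 67)
)

long <- melt(dt, id.vars = c("id", "yr", "finalgr"),
             measure.vars = list(testid = c("test01", "test02"),
                                 testgr = c("test01gr", "test02gr")))

# 'variable' only records which test slot (1 or 2) a row came from.
# Sort so each id's tests sit next to each other, then drop the column.
setorder(long, id, variable)
long[, variable := NULL]
```

Both setorder and := modify long by reference, so no reassignment is needed.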

If there are many more test columns, you can use patterns() to select them by regular expression instead of listing each name:

melt(dt, id.vars = c('id', 'yr', 'finalgr'), 
     measure.vars = patterns(testid = 'test[0-9]+$', testgr = 'test[0-9]+gr'))
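
Before relying on a regular expression, it can help to check which columns it actually picks up. The sketch below does that with grep on a trimmed copy of the sample data, then melts with unnamed patterns and supplies the output names through value.name. This is a variation on the call above, not a replacement for it; the anchored regexes and the names id_cols, gr_cols and long are assumptions made for illustration.

```r
library(data.table)

# Two rows of the question's sample data, for brevity
dt <- data.table(
  id       = c("106E1258", "104E1182"),
  yr       = c(2017, 2015),
  finalgr  = c(72, 75),
  test01   = c("R0560", "R0308"),
  test01gr = c(73, 67),
  test02   = c("R0660", "R0302"),
  test02gr = c(71, 67)
)

# Check which columns each (anchored) regex matches
id_cols <- grep("^test[0-9]+$", names(dt), value = TRUE)    # test id columns
gr_cols <- grep("^test[0-9]+gr$", names(dt), value = TRUE)  # grade columns

# Same melt as above, but naming the output columns via value.name
long <- melt(dt, id.vars = c("id", "yr", "finalgr"),
             measure.vars = patterns("^test[0-9]+$", "^test[0-9]+gr$"),
             value.name = c("testid", "testgr"))
```

Anchoring with ^ and $ keeps finalgr and any other column that merely contains "gr" out of the grade group.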
MichaelChirico answered Oct 19 '22 15:10