Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

swapping rows and ordering in a data frame

Tags:

sorting

r

I found it a bit tricky to sort some rows in my data frame in R. The data has the following structure.

df <- 
ID   TIME   DVID    AMT   DV    PERIOD
 1    0      0      50    NA    1 
 1    0.5    1      0     10    1
 1    0.5    2      0     15    1
 2    600    0      100   NA    2
 2    600    1      0     2.5   2
 2    600    2      0     4     2
 2    602    1      0     20    2
 2    602    2      0     35    2
 3    800    0      50    NA    3
 3    800    1      0     5     3
 3    800    2      0     10    3

For PERIOD >= 2 I want to swap rows for DVID to be DVID 1, 2, then 0 as shown in the output data frame below dfout. The data has to be ordered by ID, TIME, PERIOD as well.

I tried using the orderBy but could reach my goal. An example of using orderBy is this.

library(doBy)
dfout <- orderBy(~ID+TIME+PERIOD, data=df)

The output should be this:

dfout <- 
ID   TIME   DVID    AMT   DV    PERIOD
 1    0      0      50    NA    1 
 1    0.5    1      0     10    1
 1    0.5    2      0     15    1
 2    600    1      0     2.5   2
 2    600    2      0     4     2
 2    600    0      100   NA    2
 2    602    1      0     20    2
 2    602    2      0     35    2
 3    800    1      0     5     3
 3    800    2      0     10    3
 3    800    0      50    NA    3

I would appreciate if somebody share a trick on how to do that in R.

like image 234
Amer Avatar asked Jan 05 '23 21:01

Amer


2 Answers

We can do this without any group by and by doing that it should be faster.

library(dplyr)
df %>%
    arrange(ID, PERIOD, TIME, replace(DVID, PERIOD > 1 & DVID==0, 3))
#     ID  TIME DVID AMT   DV PERIOD
#1   1   0.0    0  50   NA      1
#2   1   0.5    1   0 10.0      1
#3   1   0.5    2   0 15.0      1
#4   2 600.0    1   0  2.5      2
#5   2 600.0    2   0  4.0      2
#6   2 600.0    0 100   NA      2
#7   2 602.0    1   0 20.0      2
#8   2 602.0    2   0 35.0      2
#9   3 800.0    1   0  5.0      3
#10  3 800.0    2   0 10.0      3
#11  3 800.0    0  50   NA      3

Or using data.table

library(data.table)
setDT(df)[PERIOD > 1 & DVID==0, DVID1:= 3][order(ID, TIME, PERIOD, 
                      pmax(DVID, DVID1, na.rm=TRUE))][, DVID1 := NULL][]

Or using base R

df[order(df$ID, df$PERIOD, df$TIME, with(df, replace(DVID,PERIOD>1 & DVID ==0, 3))),]
#   ID  TIME DVID AMT   DV PERIOD
#1   1   0.0    0  50   NA      1
#2   1   0.5    1   0 10.0      1
#3   1   0.5    2   0 15.0      1
#5   2 600.0    1   0  2.5      2
#6   2 600.0    2   0  4.0      2
#4   2 600.0    0 100   NA      2
#7   2 602.0    1   0 20.0      2
#8   2 602.0    2   0 35.0      2
#10  3 800.0    1   0  5.0      3
#11  3 800.0    2   0 10.0      3
#9   3 800.0    0  50   NA      3
like image 66
akrun Avatar answered Jan 15 '23 00:01

akrun


Here is a working option by specifying the 0 to be 3 when Period is larger than 1 and then order by the DVID group by PERIOD, ID and TIME using data.table package:

library(data.table)
setDT(df)
df[,.SD[order(ifelse(PERIOD > 1 & DVID == 0, 3, DVID))], .(PERIOD, ID, TIME)]

#    PERIOD ID  TIME DVID AMT   DV
# 1:      1  1   0.0    0  50   NA
# 2:      1  1   0.5    1   0 10.0
# 3:      1  1   0.5    2   0 15.0
# 4:      2  2 600.0    1   0  2.5
# 5:      2  2 600.0    2   0  4.0
# 6:      2  2 600.0    0 100   NA
# 7:      2  2 602.0    1   0 20.0
# 8:      2  2 602.0    2   0 35.0
# 9:      3  3 800.0    1   0  5.0
#10:      3  3 800.0    2   0 10.0
#11:      3  3 800.0    0  50   NA

Another way to avoid the ifelse statement:

df[,.SD[order(((PERIOD > 1)*2 + DVID)%%3)], .(PERIOD, ID, TIME)]

 #    PERIOD ID  TIME DVID AMT   DV
 # 1:      1  1   0.0    0  50   NA
 # 2:      1  1   0.5    1   0 10.0
 # 3:      1  1   0.5    2   0 15.0
 # 4:      2  2 600.0    1   0  2.5
 # 5:      2  2 600.0    2   0  4.0
 # 6:      2  2 600.0    0 100   NA
 # 7:      2  2 602.0    1   0 20.0
 # 8:      2  2 602.0    2   0 35.0
 # 9:      3  3 800.0    1   0  5.0
# 10:      3  3 800.0    2   0 10.0
# 11:      3  3 800.0    0  50   NA
like image 23
Psidom Avatar answered Jan 15 '23 00:01

Psidom